3-HYDROXYPROPIONIC ACID AND

OTHER ORGANIC COMPOUNDS 

FIELD OF THE INVENTION 

The invention relates to enzymes and methods that can be used to produce organic
acids and related products.

CROSS REFERENCE TO RELATED APPLICATIONS 

This application claims priority from the following U.S. Provisional Patent
Applications, which are herein incorporated by reference: U.S. Provisional Patent
Application Serial Number 60/252,123, filed November 20, 2000; U.S. Provisional Patent
Application Serial Number 60/285,478, filed April 20, 2001; U.S. Provisional Patent
Application Serial Number 60/306,727, filed July 20, 2001; and U.S. Provisional Patent
Application Serial Number 60/317,845, filed September 7, 2001.

BACKGROUND

Organic chemicals such as organic acids, esters, and polyols can be used to
synthesize plastic materials and other products. To meet the increasing demand for
organic chemicals, more efficient and cost-effective production methods are being
developed which utilize raw materials based on carbohydrates rather than hydrocarbons.
For example, certain bacteria have been used to produce large quantities of lactic acid
used in the production of polylactic acid.

3-hydroxypropionic acid (3-HP) is an organic acid. Although several chemical
synthesis routes have been described to produce 3-HP, only one biocatalytic route has
been previously disclosed (WO 01/16346 to Suthers, et al.). 3-HP has utility
for specialty synthesis and can be converted to commercially important intermediates by
known art in the chemical industry, e.g., acrylic acid by dehydration, malonic acid by
oxidation, esters by esterification reactions with alcohols, and 1,3-propanediol by
reduction.

SUMMARY 

The invention relates to methods and materials involved in producing
3-hydroxypropionic acid and other organic compounds (e.g., 1,3-propanediol, acrylic acid,
polymerized acrylate, esters of acrylate, polymerized 3-HP, esters of 3-HP, and malonic
acid and its esters). Specifically, the invention provides nucleic acid molecules,
polypeptides, host cells, and methods that can be used to produce 3-HP and other organic
compounds such as 1,3-propanediol, acrylic acid, polymerized acrylate, esters of acrylate,
polymerized 3-HP, esters of 3-HP, and malonic acid and its esters. 3-HP has potential to
be both biologically and commercially important. For example, the nutritional industry
can use 3-HP as a food, feed additive, or preservative, while the derivatives mentioned
above can be produced from 3-HP. The nucleic acid molecules described herein can be
used to engineer host cells with the ability to produce 3-HP as well as other organic
compounds such as 1,3-propanediol, acrylic acid, polymerized acrylate, esters of acrylate,
polymerized 3-HP, and esters of 3-HP. The polypeptides described herein can be used in
cell-free systems to make 3-HP as well as other organic compounds such as
1,3-propanediol, acrylic acid, polymerized acrylate, esters of acrylate, polymerized 3-HP,
and esters of 3-HP. The host cells described herein can be used in culture systems to
produce large quantities of 3-HP as well as other organic compounds such as
1,3-propanediol, acrylic acid, polymerized acrylate, esters of acrylate, polymerized 3-HP,
and esters of 3-HP.

One aspect of the invention provides cells that have lactyl-CoA dehydratase
activity and 3-hydroxypropionyl-CoA dehydratase activity, and methods of making
products such as those described herein by culturing at least one of the cells that have
lactyl-CoA dehydratase activity and 3-hydroxypropionyl-CoA dehydratase activity. In
some embodiments, the cell can also contain an exogenous nucleic acid molecule that
encodes one or more of the following polypeptides: a polypeptide having E1 activator
activity; an E2 α polypeptide that is a subunit of an enzyme having lactyl-CoA
dehydratase activity; an E2 β polypeptide that is a subunit of an enzyme having
lactyl-CoA dehydratase activity; and a polypeptide having 3-hydroxypropionyl-CoA
dehydratase activity. Additionally, the cell can have CoA transferase activity, CoA
synthetase activity, poly hydroxyacid synthase activity, 3-hydroxypropionyl-CoA
hydrolase activity, 3-hydroxyisobutyryl-CoA hydrolase activity, and/or lipase activity.
Moreover, the cell can contain at least one exogenous nucleic acid molecule that
expresses one or more polypeptides that have CoA transferase activity,
3-hydroxypropionyl-CoA hydrolase activity, 3-hydroxyisobutyryl-CoA hydrolase activity,
CoA synthetase activity, poly hydroxyacid synthase activity, and/or lipase activity.

In another embodiment of the invention, the cell that has lactyl-CoA dehydratase
activity and 3-hydroxypropionyl-CoA dehydratase activity produces a product, for
example, 3-HP, polymerized 3-HP, and/or an ester of 3-HP, such as methyl
hydroxypropionate, ethyl hydroxypropionate, propyl hydroxypropionate, and/or butyl
hydroxypropionate. Accordingly, the invention also provides methods of producing one
or more of these products. These methods involve culturing the cell that has lactyl-CoA
dehydratase activity and 3-hydroxypropionyl-CoA dehydratase activity under conditions
that allow the product to be produced. These cells also can have CoA synthetase activity
and/or poly hydroxyacid synthase activity.

Another aspect of the invention provides cells that have CoA synthetase activity,
lactyl-CoA dehydratase activity, and poly hydroxyacid synthase activity. In some
embodiments, these cells also can contain an exogenous nucleic acid molecule that
encodes one or more of the following polypeptides: a polypeptide having E1 activator
activity; an E2 α polypeptide that is a subunit of an enzyme having lactyl-CoA
dehydratase activity; an E2 β polypeptide that is a subunit of an enzyme having
lactyl-CoA dehydratase activity; a polypeptide having CoA synthetase activity; and a
polypeptide having poly hydroxyacid synthase activity.

In another embodiment of the invention, the cell that has CoA synthetase activity,
lactyl-CoA dehydratase activity, and poly hydroxyacid synthase activity can produce a
product, for example, polymerized acrylate.

Another aspect of the invention provides a cell comprising CoA transferase
activity, lactyl-CoA dehydratase activity, and lipase activity. In some embodiments, the
cell also can contain an exogenous nucleic acid molecule that encodes one or more of the
following polypeptides: a polypeptide having CoA transferase activity; a polypeptide
having E1 activator activity; an E2 α polypeptide that is a subunit of an enzyme having
lactyl-CoA dehydratase activity; an E2 β polypeptide that is a subunit of an enzyme
having lactyl-CoA dehydratase activity; and a polypeptide having lipase activity. This
cell can be used, among other things, to produce products such as esters of acrylate (e.g.,
methyl acrylate, ethyl acrylate, propyl acrylate, and butyl acrylate).

In some embodiments, 1,3-propanediol can be created from either 3-HP-CoA or
3-HP via the use of polypeptides having enzymatic activity. These polypeptides can be
used either in vitro or in vivo. When converting 3-HP-CoA to 1,3-propanediol,
polypeptides having oxidoreductase activity or reductase activity (e.g., enzymes from the
1.1.1.- class of enzymes) can be used. Alternatively, when creating 1,3-propanediol from
3-HP, a combination of (1) a polypeptide having aldehyde dehydrogenase activity (e.g.,
an enzyme from the 1.1.1.34 class) and (2) a polypeptide having alcohol dehydrogenase
activity (e.g., an enzyme from the 1.1.1.32 class) can be used.

In some embodiments of the invention, products are produced in vitro (outside of
a cell). In other embodiments of the invention, products are produced using a
combination of in vitro and in vivo (within a cell) methods. In yet other embodiments of
the invention, products are produced in vivo. For methods involving in vivo steps, the
cells can be isolated cultured cells or whole organisms such as transgenic plants,
non-human mammals, or single-celled organisms such as yeast and bacteria (e.g.,
Lactobacillus, Lactococcus, Bacillus, and Escherichia cells). Hereinafter such cells are
referred to as production cells. Products produced by these production cells can be
organic products such as 3-HP and/or the nucleic acid molecules and polypeptides
described herein.

Another aspect of the invention provides polypeptides having an amino acid
sequence that (1) is set forth in SEQ ID NO:2, 10, 18, 26, 35, 37, 39, 41, 141, 160, or 161,
(2) is at least 10 contiguous amino acid residues of a sequence set forth in SEQ ID NO:2,
10, 18, 26, 35, 37, 39, 41, 141, 160, or 161, (3) has at least 65 percent sequence identity
with at least 10 contiguous amino acid residues of a sequence set forth in SEQ ID NO:2,
10, 18, 26, 35, 37, 39, 41, 141, 160, or 161, (4) is a sequence set forth in SEQ ID NO:2,
10, 18, 26, 35, 37, 39, 41, 141, 160, or 161 having conservative amino acid substitutions,
or (5) has at least 65 percent sequence identity with a sequence set forth in SEQ ID NO:2,
10, 18, 26, 35, 37, 39, 41, 141, 160, or 161. Accordingly, the invention also provides
nucleic acid sequences that encode any of the polypeptides described herein as well as
specific binding agents that bind to any of the polypeptides described herein. Likewise,
the invention provides transformed cells that contain any of the nucleic acid sequences
that encode any of the polypeptides described herein. These cells can be used to produce
nucleic acid molecules, polypeptides, and organic compounds. The polypeptides can be
used to catalyze the formation of organic compounds or can be used as antigens to create
specific binding agents.
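By way of illustration only, the "at least 65 percent sequence identity with at least 10
contiguous amino acid residues" criterion can be sketched as a simple ungapped
comparison. The sketch below is an assumption-laden simplification: this section does not
state how percent identity is computed, the function names are hypothetical, and the
example sequences are made up and are not any of the SEQ ID NOs.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of identical positions in two equal-length, ungapped sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def has_window_meeting_threshold(candidate: str, reference: str,
                                 window: int = 10,
                                 threshold: float = 65.0) -> bool:
    """True if some `window`-residue stretch of `reference` is matched by an
    equally long stretch of `candidate` at >= `threshold` percent identity.

    Ungapped, brute-force comparison (an illustrative assumption, not the
    method of the specification).
    """
    for i in range(len(reference) - window + 1):
        ref_win = reference[i:i + window]
        for j in range(len(candidate) - window + 1):
            if percent_identity(candidate[j:j + window], ref_win) >= threshold:
                return True
    return False


# Made-up sequences for demonstration only.
reference = "MKTAYIAKQR"
print(percent_identity("MKTAYIAKAA", reference))            # 8 of 10 positions match
print(has_window_meeting_threshold("XXMKTAYIAKQRXX", reference))
```

A real comparison would normally use an alignment tool that handles gaps; the brute-force
window scan here is only meant to make the wording of the criterion concrete.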

In yet another embodiment, the invention provides isolated nucleic acid molecules
that contain at least one of the following nucleic acid sequences: (1) a nucleic acid
sequence as set forth in SEQ ID NO:1, 9, 17, 25, 33, 34, 36, 38, 40, 42, 129, 140, 142,
162, or 163; (2) a nucleic acid sequence having at least 10 consecutive nucleotides from a
sequence set forth in SEQ ID NO:1, 9, 17, 25, 33, 34, 36, 38, 40, 42, 129, 140, 142, 162,
or 163; (3) a nucleic acid sequence that hybridizes under hybridization conditions (e.g.,
moderately or highly stringent hybridization conditions) to a sequence set forth in SEQ
ID NO:1, 9, 17, 25, 33, 34, 36, 38, 40, 42, 129, 140, 142, 162, or 163; (4) a nucleic acid
sequence having 65 percent sequence identity with at least 10 consecutive nucleotides
from a sequence set forth in SEQ ID NO:1, 9, 17, 25, 33, 34, 36, 38, 40, 42, 129, 140,
142, 162, or 163; and (5) a nucleic acid sequence having at least 65 percent sequence
identity with a sequence set forth in SEQ ID NO:1, 9, 17, 25, 33, 34, 36, 38, 40, 42, 129,
140, 142, 162, or 163. Accordingly, the invention also provides a production cell that
contains at least one exogenous nucleic acid having any of the nucleic acid sequences
provided above. The production cell can be used to express polypeptides that have an
enzymatic activity such as CoA transferase activity, lactyl-CoA dehydratase activity, CoA
synthase activity, dehydratase activity, dehydrogenase activity, malonyl CoA reductase
activity, β-alanine ammonia lyase activity, and/or 3-hydroxypropionyl-CoA dehydratase
activity. Accordingly, the invention also provides methods of producing polypeptides
encoded by the nucleic acid sequences described above.
The invention also provides several methods such as methods for making 3-HP
from lactate, phosphoenolpyruvate (PEP), or pyruvate. In some embodiments, methods
for making 3-HP from lactate, PEP, or pyruvate involve culturing a cell containing at
least one exogenous nucleic acid under conditions that allow the cell to produce 3-HP.
These methods can be practiced using the various types of production cells described
herein. In some embodiments, the production cells can have one or more of the following
activities: CoA transferase activity, 3-hydroxypropionyl-CoA hydrolase activity,
3-hydroxyisobutyryl-CoA hydrolase activity, dehydratase activity, and/or malonyl CoA
reductase activity.

In other embodiments, the methods involve making 3-HP wherein lactate is
contacted with a first polypeptide having CoA transferase activity or CoA synthetase
activity such that lactyl-CoA is formed, then contacting lactyl-CoA with a second
polypeptide having lactyl-CoA dehydratase activity to form acrylyl-CoA, then contacting
acrylyl-CoA with a third polypeptide having 3-hydroxypropionyl-CoA dehydratase
activity to form 3-hydroxypropionic acid-CoA, and then contacting 3-hydroxypropionic
acid-CoA with the first polypeptide to form 3-HP or with a fourth polypeptide having
3-hydroxypropionyl-CoA hydrolase activity or 3-hydroxyisobutyryl-CoA hydrolase activity
to form 3-HP.
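As an informal aid to reading the sequence of steps just described, the route from lactate
to 3-HP can be written down as an ordered list of (substrate, activity, product) records.
This is only a bookkeeping sketch: the `Step` type, the `trace` helper, and the shorthand
"3-HP-CoA" for 3-hydroxypropionic acid-CoA are illustrative choices, and nothing here
models enzyme kinetics or yields.

```python
from typing import List, NamedTuple


class Step(NamedTuple):
    substrate: str
    activity: str   # enzymatic activity named in the text
    product: str


# Steps as described in the preceding paragraph (labels abbreviated).
LACTATE_TO_3HP: List[Step] = [
    Step("lactate", "CoA transferase or CoA synthetase", "lactyl-CoA"),
    Step("lactyl-CoA", "lactyl-CoA dehydratase", "acrylyl-CoA"),
    Step("acrylyl-CoA", "3-hydroxypropionyl-CoA dehydratase", "3-HP-CoA"),
    Step("3-HP-CoA", "CoA transferase or a CoA hydrolase", "3-HP"),
]


def trace(pathway: List[Step], start: str) -> List[str]:
    """Follow the pathway from `start` and return the compounds visited in order."""
    by_substrate = {step.substrate: step for step in pathway}
    route = [start]
    seen = {start}
    current = start
    while current in by_substrate:
        current = by_substrate[current].product
        if current in seen:      # guard against accidental cycles in the table
            break
        seen.add(current)
        route.append(current)
    return route


print(" -> ".join(trace(LACTATE_TO_3HP, "lactate")))
```

The same table form could hold the other routes mentioned in this summary (for example,
the one via acetyl-CoA and malonyl-CoA) simply by listing different steps.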

Another aspect of the invention provides methods for making polymerized 3-HP.
These methods involve making 3-hydroxypropionic acid-CoA as described above, and
then contacting the 3-hydroxypropionic acid-CoA with a polypeptide having poly
hydroxyacid synthase activity to form polymerized 3-HP.

In yet another embodiment of the invention, methods for making an ester of 3-HP 
are provided. These methods involve making 3-HP as described above, and then 
additionally contacting 3-HP with a fifth polypeptide having lipase activity to form an 
ester. 

The invention also provides methods for making polymerized acrylate. These
methods involve culturing a cell that has CoA synthetase activity, lactyl-CoA
dehydratase activity, and poly hydroxyacid synthase activity such that polymerized
acrylate is made. Accordingly, the invention also provides methods of making
polymerized acrylate wherein lactate is contacted with a first polypeptide having CoA
synthetase activity to form lactyl-CoA, then contacting lactyl-CoA with a second
polypeptide having lactyl-CoA dehydratase activity to form acrylyl-CoA, and then
contacting acrylyl-CoA with a third polypeptide having poly hydroxyacid synthase
activity to form polymerized acrylate.

The invention also provides methods of making an ester of acrylate. These
methods involve culturing a cell that has CoA transferase activity, lipase activity, and
lactyl-CoA dehydratase activity under conditions that allow the cell to produce an ester.

In another embodiment, the invention provides methods for making an ester of 
acrylate, wherein acrylyl-CoA is formed as described above, and then acrylyl-CoA is 
contacted with a polypeptide having CoA transferase activity to form acrylate, and 
acrylate is contacted with a polypeptide having lipase activity to form the ester. 

The invention also provides methods for making 3-HP. These methods involve
culturing a cell containing at least one exogenous nucleic acid that encodes at least one
polypeptide such that 3-HP is produced from acetyl-CoA or malonyl-CoA.

Alternative embodiments provide methods of making 3-HP, wherein acetyl-CoA
is contacted with a first polypeptide having acetyl-CoA carboxylase activity to form
malonyl-CoA, and malonyl-CoA is contacted with a second polypeptide having
malonyl-CoA reductase activity to form 3-HP.

In other embodiments, malonyl-CoA can be contacted with a polypeptide having 
malonyl-CoA reductase activity so that 3-HP can be made. 

In another embodiment, the invention provides a method for making 3-HP that
uses a β-alanine intermediate. This method can be performed by contacting β-alanine
CoA with a first polypeptide having β-alanyl-CoA ammonia lyase activity (such as a
polypeptide having the amino acid sequence set forth in SEQ ID NO:160 or 161) to form
acrylyl-CoA, contacting acrylyl-CoA with a second polypeptide having 3-HP-CoA
dehydratase activity to form 3-HP-CoA, and contacting 3-HP-CoA with a third
polypeptide having glutamate dehydrogenase activity to make 3-HP.

Unless otherwise defined, all technical and scientific terms used herein have the
same meaning as commonly understood by one of ordinary skill in the art to which this
invention pertains. Although methods and materials similar or equivalent to those
described herein can be used in the practice or testing of the present invention, suitable
methods and materials are described below. All publications, patent applications, patents,
and other references mentioned herein are incorporated by reference in their entirety. In
case of conflict, the present specification, including definitions, will control. In addition,
the materials, methods, and examples are illustrative only and not intended to be limiting.

Other features and advantages of the invention will be apparent from the 
following detailed description, and from the claims. 


DESCRIPTION OF DRAWINGS 

Figure 1 is a diagram of a pathway for making 3-HP.

Figure 2 is a diagram of a pathway for making polymerized 3-HP.

Figure 3 is a diagram of a pathway for making esters of 3-HP.

Figure 4 is a diagram of a pathway for making polymerized acrylic acid.

Figure 5 is a diagram of a pathway for making esters of acrylate.

Figure 6 is a listing of a nucleic acid sequence that encodes a polypeptide having
CoA transferase activity (SEQ ID NO:1).

Figure 7 is a listing of an amino acid sequence of a polypeptide having CoA
transferase activity (SEQ ID NO:2).

Figure 8 is an alignment of the nucleic acid sequences set forth in SEQ ID NOs:1,
3, 4, and 5.

Figure 9 is an alignment of the amino acid sequences set forth in SEQ ID NOs:2,
6, 7, and 8.

Figure 10 is a listing of a nucleic acid sequence that encodes a polypeptide having
E1 activator activity (SEQ ID NO:9).

Figure 11 is a listing of an amino acid sequence of a polypeptide having E1
activator activity (SEQ ID NO:10).

Figure 12 is an alignment of the nucleic acid sequences set forth in SEQ ID
NOs:9, 11, 12, and 13.

Figure 13 is an alignment of the amino acid sequences set forth in SEQ ID
NOs:10, 14, 15, and 16.

Figure 14 is a listing of a nucleic acid sequence that encodes an E2 α subunit of an
enzyme having lactyl-CoA dehydratase activity (SEQ ID NO:17).

Figure 15 is a listing of an amino acid sequence of an E2 α subunit of an enzyme
having lactyl-CoA dehydratase activity (SEQ ID NO:18).

Figure 16 is an alignment of the nucleic acid sequences set forth in SEQ ID
NOs:17, 19, 20, and 21.

Figure 17 is an alignment of the amino acid sequences set forth in SEQ ID
NOs:18, 22, 23, and 24.

Figure 18 is a listing of a nucleic acid sequence that encodes an E2 β subunit of an
enzyme having lactyl-CoA dehydratase activity (SEQ ID NO:25). The "G" at position
443 can be an "A"; and the "A" at position 571 can be a tP.

Figure 19 is a listing of an amino acid sequence of an E2 β subunit of an enzyme
having lactyl-CoA dehydratase activity (SEQ ID NO:26).

Figure 20 is an alignment of the nucleic acid sequences set forth in SEQ ID
NOs:25, 27, 28, and 29.

Figure 21 is an alignment of the amino acid sequences set forth in SEQ ID
NOs:26, 30, 31, and 32.

Figure 22 is a listing of a nucleic acid sequence of genomic DNA from
Megasphaera elsdenii (SEQ ID NO:33).

Figure 23 is a listing of a nucleic acid sequence that encodes a polypeptide from
Megasphaera elsdenii (SEQ ID NO:34).

Figure 24 is a listing of an amino acid sequence of a polypeptide from
Megasphaera elsdenii (SEQ ID NO:35).

Figure 25 is a listing of a nucleic acid sequence that encodes a polypeptide having
enzymatic activity (SEQ ID NO:36).

Figure 26 is a listing of an amino acid sequence of a polypeptide having
enzymatic activity (SEQ ID NO:37).

Figure 27 is a listing of a nucleic acid sequence that contains non-coding as well
as coding sequence of a polypeptide having CoA synthase, dehydratase, and
dehydrogenase activity (SEQ ID NO:38). The start site for the coding sequence is at
position 480, a ribosome binding site is at position 466-473, and the stop codon is at
position 5946.

Figure 28 is a listing of an amino acid sequence from a polypeptide having CoA
synthase, dehydratase, and dehydrogenase activity (SEQ ID NO:39).

Figure 29 is a listing of a nucleic acid sequence that encodes a polypeptide having
3-hydroxypropionyl-CoA dehydratase activity (SEQ ID NO:40).

Figure 30 is a listing of an amino acid sequence of a polypeptide having
3-hydroxypropionyl-CoA dehydratase activity (SEQ ID NO:41).

Figure 31 is a listing of a nucleic acid sequence that contains non-coding as well
as coding sequence of a polypeptide having 3-hydroxypropionyl-CoA dehydratase
activity (SEQ ID NO:42).

Figure 32 is an alignment of the nucleic acid sequences set forth in SEQ ID
NOs:40, 43, 44, and 45.

Figure 33 is an alignment of the amino acid sequences set forth in SEQ ID
NOs:41, 46, 47, and 48.

Figure 34 is a diagram of the construction of a synthetic operon (pTDH) that
encodes for polypeptides having CoA transferase activity, lactyl-CoA dehydratase
activity (E1, E2 α, and E2 β), and 3-hydroxypropionyl-CoA dehydratase activity
(3-HP-CoA dehydratase).

Figure 35A and B is a diagram of the construction of a synthetic operon (pHTD)
that encodes for polypeptides having CoA transferase activity, lactyl-CoA dehydratase
activity (E1, E2 α, and E2 β), and 3-hydroxypropionyl-CoA dehydratase activity
(3-HP-CoA dehydratase).

Figure 36A and B is a diagram of the construction of a synthetic operon
(pEIITHrEI) that encodes for polypeptides having CoA transferase activity, lactyl-CoA
dehydratase activity (E1, E2 α, and E2 β), and 3-hydroxypropionyl-CoA dehydratase
activity (3-HP-CoA dehydratase).

Figure 37A and B is a diagram of the construction of a synthetic operon
(pEnTHtEI) that encodes for polypeptides having CoA transferase activity, lactyl-CoA
dehydratase activity (E1, E2 α, and E2 β), and 3-hydroxypropionyl-CoA dehydratase
activity (3-HP-CoA dehydratase).

Figure 38A and B is a diagram of the construction of two plasmids, pEIITH and
pPROEI. The pEIITH plasmid encodes polypeptides having CoA transferase activity,
lactyl-CoA dehydratase activity (E2 α and E2 β), and 3-hydroxypropionyl-CoA
dehydratase activity (3-HP-CoA dehydratase), and the pPROEI plasmid encodes a
polypeptide having E1 activator activity.

Figure 39 is a listing of a nucleic acid sequence that encodes a polypeptide having
CoA synthase, dehydratase, and dehydrogenase activity (SEQ ID NO:129).

Figure 40 is an alignment of the amino acid sequences set forth in SEQ ID
NOs:39, 130, and 131. The uppercase amino acid residues represent positions where that
amino acid residue is present in two or more sequences.

Figure 41 is an alignment of the amino acid sequences set forth in SEQ ID
NOs:39, 132, and 133. The uppercase amino acid residues represent positions where that
amino acid residue is present in two or more sequences.

Figure 42 is an alignment of the amino acid sequences set forth in SEQ ID
NOs:39, 134, and 135. The uppercase amino acid residues represent positions where that
amino acid residue is present in two or more sequences.

Figure 43 is a diagram of several pathways for making organic compounds using
the multifunctional GSH-enzyme.

Figure 44 is a diagram of a pathway for making 3-HP via acetyl-CoA and
malonyl-CoA.

Figure 45 is a diagram of pMSD8, pET30a/accl, pFN476, and PET286 constructs.

Figure 46 contains a total ion chromatogram and mass spectrums of
Coenzyme A thioesters. Panel A is a total ion chromatogram illustrating the separation of
Coenzyme A and four CoA-organic thioesters: 1=Coenzyme A, 2=lactyl-CoA,
3=acetyl-CoA, 4=acrylyl-CoA, 5=propionyl-CoA. Panel B is a mass spectrum of Coenzyme A.
Panel C is a mass spectrum of lactyl-CoA. Panel D is a mass spectrum of acetyl-CoA.
Panel E is a mass spectrum of acrylyl-CoA. Panel F is a mass spectrum of
propionyl-CoA.

Figure 47 contains ion chromatograms and mass spectrums. Panel A is a total ion
chromatogram of a mixture of lactyl-CoA and 3-HP-CoA. The Panel A insert is the mass
spectrum recorded under peak 1. Panel B is a total ion chromatogram of lactyl-CoA. The
Panel B insert is the mass spectrum recorded under peak 2. In each panel, peak 1 is
3-HP-CoA, and peak 2 is lactyl-CoA. The peak labeled with an asterisk was confirmed not
to be a CoA ester.

Figure 48 contains ion chromatograms and mass spectrums. Panel A is a total ion
chromatogram of CoA esters derived from a broth produced by E. coli transfected with
pEIITHrEI. The Panel A insert is the mass spectrum recorded under peak 1. Panel B is a
total ion chromatogram of CoA esters derived from a broth produced by control E. coli
not transfected with pEIITHrEI. The Panel B insert is the mass spectrum recorded under
peak 2. In each panel, peak 1 is 3-HP-CoA and peak 2 is lactyl-CoA. The peaks labeled
with an asterisk were confirmed not to be CoA esters.

Figure 49 is a listing of a nucleic acid sequence that encodes a polypeptide having
malonyl-CoA reductase activity (SEQ ID NO:140).

Figure 50 is a listing of an amino acid sequence of a polypeptide having
malonyl-CoA reductase activity (SEQ ID NO:141).

Figure 51 is a listing of a nucleic acid sequence that encodes a portion of a
polypeptide having malonyl-CoA reductase activity (SEQ ID NO:142).

Figure 52 is an alignment of the amino acid sequences set forth in SEQ ID
NOs:141, 143, 144, 145, 146, and 147.

Figure 53 is an alignment of the nucleic acid sequences set forth in SEQ ID
NOs:140, 148, 149, 150, 151, and 152.

Figure 54 is a diagram of a pathway for making 3-HP via a β-alanine intermediate.

Figure 55 is a diagram of a pathway for making 3-HP via a β-alanine intermediate.

Figure 56 is a listing of an amino acid sequence of a polypeptide having
β-alanyl-CoA ammonia lyase activity (SEQ ID NO:160).

Figure 57 is a listing of an amino acid sequence of a polypeptide having
β-alanyl-CoA ammonia lyase activity (SEQ ID NO:161).

Figure 58 is a listing of a nucleic acid sequence that encodes a polypeptide having
β-alanyl-CoA ammonia lyase activity (SEQ ID NO:162).

Figure 59 is a listing of a nucleic acid sequence that can encode a polypeptide
having β-alanyl-CoA ammonia lyase activity (SEQ ID NO:163).

DETAILED DESCRIPTION

I. Terms 

Nucleic acid: The term "nucleic acid" as used herein encompasses both RNA and
DNA including, without limitation, cDNA, genomic DNA, and synthetic (e.g., chemically
synthesized) DNA. The nucleic acid can be double-stranded or single-stranded. Where
single-stranded, the nucleic acid can be the sense strand or the antisense strand. In
addition, nucleic acid can be circular or linear.

Isolated: The term "isolated" as used herein with reference to nucleic acid refers
to a naturally-occurring nucleic acid that is not immediately contiguous with both of the
sequences with which it is immediately contiguous (one on the 5' end and one on the 3'
end) in the naturally-occurring genome of the organism from which it is derived. For
example, an isolated nucleic acid can be, without limitation, a recombinant DNA
molecule of any length, provided one of the nucleic acid sequences normally found
immediately flanking that recombinant DNA molecule in a naturally-occurring genome is
removed or absent. Thus, an isolated nucleic acid includes, without limitation, a
recombinant DNA that exists as a separate molecule (e.g., a cDNA or a genomic DNA
fragment produced by PCR or restriction endonuclease treatment) independent of other
sequences as well as recombinant DNA that is incorporated into a vector, an
autonomously replicating plasmid, a virus (e.g., a retrovirus, adenovirus, or herpes virus),
or into the genomic DNA of a prokaryote or eukaryote. In addition, an isolated nucleic
acid can include a recombinant DNA molecule that is part of a hybrid or fusion nucleic
acid sequence.

The term "isolated" as used herein with reference to nucleic acid also includes any
non-naturally-occurring nucleic acid since non-naturally-occurring nucleic acid sequences
are not found in nature and do not have immediately contiguous sequences in a
naturally-occurring genome. For example, non-naturally-occurring nucleic acid such as an
engineered nucleic acid is considered to be isolated nucleic acid. Engineered nucleic acid
can be made using common molecular cloning or chemical nucleic acid synthesis
techniques. Isolated non-naturally-occurring nucleic acid can be independent of other
sequences, or incorporated into a vector, an autonomously replicating plasmid, a virus
(e.g., a retrovirus, adenovirus, or herpes virus), or the genomic DNA of a prokaryote or
eukaryote. In addition, a non-naturally-occurring nucleic acid can include a nucleic acid
molecule that is part of a hybrid or fusion nucleic acid sequence.

It will be apparent to those of skill in the art that a nucleic acid existing among
hundreds to millions of other nucleic acid molecules within, for example, cDNA or
genomic libraries, or gel slices containing a genomic DNA restriction digest is not to be
considered an isolated nucleic acid.

Exogenous: The term "exogenous" as used herein with reference to nucleic acid
and a particular cell refers to any nucleic acid that does not originate from that particular
cell as found in nature. Thus, non-naturally-occurring nucleic acid is considered to be
exogenous to a cell once introduced into the cell. Nucleic acid that is naturally-occurring
also can be exogenous to a particular cell. For example, an entire chromosome isolated
from a cell of person X is an exogenous nucleic acid with respect to a cell of person Y
once that chromosome is introduced into Y's cell.

Hybridization: The term "hybridization" as used herein refers to a method of
testing for complementarity in the nucleotide sequence of two nucleic acid molecules,
based on the ability of complementary single-stranded DNA and/or RNA to form a
duplex molecule. Nucleic acid hybridization techniques can be used to obtain an isolated
nucleic acid within the scope of the invention. Briefly, any nucleic acid having some
homology to a sequence set forth in SEQ ID NO:1, 9, 17, 25, 33, 34, 36, 38, 40, 42, 129,
140, 142, 162, or 163 can be used as a probe to identify a similar nucleic acid by
hybridization under conditions of moderate to high stringency. Once identified, the
nucleic acid then can be purified, sequenced, and analyzed to determine whether it is
within the scope of the invention as described herein.

Hybridization can be done by Southern or Northern analysis to identify a DNA or
RNA sequence, respectively, that hybridizes to a probe. The probe can be labeled with a
biotin, digoxygenin, an enzyme, or a radioisotope such as 32P. The DNA or RNA to be
analyzed can be electrophoretically separated on an agarose or polyacrylamide gel,
transferred to nitrocellulose, nylon, or other suitable membrane, and hybridized with the
probe using standard techniques well known in the art such as those described in sections
7.39-7.52 of Sambrook et al., (1989) Molecular Cloning, second edition, Cold Spring
Harbor Laboratory, Plainview, NY. Typically, a probe is at least about 20 nucleotides in
length. For example, a probe corresponding to a 20 nucleotide sequence set forth in SEQ
ID NO:1, 9, 17, 25, 33, 34, 36, 38, 40, 42, 129, 140, or 142 can be used to identify an
identical or similar nucleic acid. In addition, probes longer or shorter than 20 nucleotides
can be used.

The invention also provides isolated nucleic acid sequences that are at least about 
12 bases in length (e.g., at least about 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 40, 50, 60, 
100, 250, 500, 750, 1000, 1500, 2000, 3000, 4000, or 5000 bases in length) and hybridize, 
under hybridization conditions, to the sense or antisense strand of a nucleic acid having 
the sequence set forth in SEQ ID NO:1, 9, 17, 25, 33, 34, 36, 38, 40, 42, 129, 140, 142, 
162, or 163. The hybridization conditions can be moderately or highly stringent 
hybridization conditions. 

For the purpose of this invention, moderately stringent hybridization conditions 
mean the hybridization is performed at about 42°C in a hybridization solution containing 
25 mM KPO4 (pH 7.4), 5X SSC, 5X Denhardt's solution, 50 µg/mL denatured, sonicated 
salmon sperm DNA, 50% formamide, 10% Dextran sulfate, and 1-15 ng/mL probe (about 
5×10⁷ cpm/µg), while the washes are performed at about 50°C with a wash solution 
containing 2X SSC and 0.1% sodium dodecyl sulfate. 

Highly stringent hybridization conditions mean the hybridization is performed at 
about 42°C in a hybridization solution containing 25 mM KPO4 (pH 7.4), 5X SSC, 5X 
Denhardt's solution, 50 µg/mL denatured, sonicated salmon sperm DNA, 50% formamide, 
10% Dextran sulfate, and 1-15 ng/mL probe (about 5×10⁷ cpm/µg), while the washes are 
performed at about 65°C with a wash solution containing 0.2X SSC and 0.1% sodium 
dodecyl sulfate. 
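
The moderately and highly stringent conditions defined above share the same hybridization step and differ only in the washes. The following is a minimal Python sketch of that comparison. The class name, field names, and helper function are illustrative assumptions and do not appear in the specification; the numeric values are the approximate values stated in the text.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class HybridizationConditions:
    """Illustrative record; field names are assumptions, values come from the text."""
    hybridization_temp_c: int  # about 42 °C in both definitions
    formamide_percent: int     # 50% formamide in both definitions
    wash_temp_c: int           # about 50 °C (moderate) or about 65 °C (high)
    wash_ssc_x: float          # 2X SSC (moderate) or 0.2X SSC (high)
    wash_sds_percent: float    # 0.1% sodium dodecyl sulfate in both definitions


MODERATE = HybridizationConditions(42, 50, 50, 2.0, 0.1)
HIGH = HybridizationConditions(42, 50, 65, 0.2, 0.1)


def differing_fields(a, b):
    """Return the names of the fields whose values differ between two condition sets."""
    da, db = asdict(a), asdict(b)
    return [name for name in da if da[name] != db[name]]


print(differing_fields(MODERATE, HIGH))
```

Under these assumptions only the wash temperature and the wash SSC concentration are reported as different.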

Purified: The term "purified" as used herein does not require absolute purity; 
rather, it is intended as a relative term. Thus, for example, a purified polypeptide or 
nucleic acid preparation can be one in which the subject polypeptide or nucleic acid, 
respectively, is at a higher concentration than the polypeptide or nucleic acid would be in 
its natural environment within an organism. For example, a polypeptide preparation can 
be considered purified if the polypeptide content in the preparation represents at least 
50%, 60%, 70%, 80%, 85%, 90%, 92%, 95%, 98%, or 99% of the total protein content of 
the preparation. 

Transformed: A "transformed" cell is a cell into which a nucleic acid molecule 
has been introduced by, for example, molecular biology techniques. As used herein, the 
term "transformation" encompasses all techniques by which a nucleic acid molecule 
might be introduced into such a cell including, without limitation, transfection with a viral 
vector, conjugation, transformation with a plasmid vector, and introduction of naked 
DNA by electroporation, lipofection, and particle gun acceleration. 

Recombinant: A "recombinant" nucleic acid is one having (1) a sequence that is 
not naturally occurring in the organism in which it is expressed or (2) a sequence made by 
an artificial combination of two otherwise-separated, shorter sequences. This artificial 
combination is often accomplished by chemical synthesis or, more commonly, by the 
artificial manipulation of isolated segments of nucleic acids, e.g., by genetic engineering 
techniques. "Recombinant" is also used to describe nucleic acid molecules that have been 
artificially manipulated, but contain the same regulatory sequences and coding regions 
that are found in the organism from which the nucleic acid was isolated. 

Specific binding agent: A "specific binding agent" is an agent that is capable of 
specifically binding to any of the polypeptides described herein, and can include 
polyclonal antibodies, monoclonal antibodies (including humanized monoclonal 
antibodies), and fragments of monoclonal antibodies such as Fab, F(ab')2, and Fv 
fragments, as well as any other agent capable of specifically binding to such 
polypeptides. 

Antibodies to the polypeptides provided herein (or fragments thereof) can be used 
to purify or identify such polypeptides. The sequences 
provided herein allow for the production of specific antibody-based binding agents that 
recognize the polypeptides described herein. 

Monoclonal or polyclonal antibodies can be produced to the polypeptides, 
portions of the polypeptides, or variants thereof. Optimally, antibodies raised against one 
or more epitopes on a polypeptide antigen will specifically detect that polypeptide. That 
is, antibodies raised against one particular polypeptide would recognize and bind that 
particular polypeptide, and would not substantially recognize or bind to other 
polypeptides. The determination that an antibody specifically binds to a particular 
polypeptide is made by any one of a number of standard immunoassay methods; for 
instance, Western blotting (See, e.g., Sambrook et al. (ed.), Molecular Cloning: A 
Laboratory Manual, 2nd ed., vol. 1-3, Cold Spring Harbor Laboratory Press, Cold Spring 
Harbor, N.Y., 1989). 

To determine that a given antibody preparation (such as a preparation produced in 
a mouse against a polypeptide having the amino acid sequence set forth in SEQ ID NO: 
2) specifically detects the appropriate polypeptide (e.g., a polypeptide having the amino 
acid sequence set forth in SEQ ID NO: 2) by Western blotting, total cellular protein can 
be extracted from cells and separated by SDS-polyacrylamide gel electrophoresis. The 
separated total cellular protein can then be transferred to a membrane (e.g., 
nitrocellulose), and the antibody preparation incubated with the membrane. After 
washing the membrane to remove non-specifically bound antibodies, the presence of 
specifically bound antibodies can be detected using an appropriate secondary antibody 
(e.g., an anti-mouse antibody) conjugated to an enzyme such as alkaline phosphatase 
since application of 5-bromo-4-chloro-3-indolyl phosphate/nitro blue tetrazolium results 
in the production of a densely blue-colored compound by immuno-localized alkaline 
phosphatase. 

Substantially pure polypeptides suitable for use as an immunogen can be obtained 
from transfected cells, transformed cells, or wild-type cells. Polypeptide concentrations 
in the final preparation can be adjusted, for example, by concentration on an Amicon 
filter device, to the level of a few micrograms per milliliter. In addition, polypeptides 
ranging in size from full-length polypeptides to polypeptides having as few as nine amino 
acid residues can be utilized as immunogens. Such polypeptides can be produced in cell 
culture, can be chemically synthesized using standard methods, or can be obtained by 
cleaving large polypeptides into smaller polypeptides that can be purified. Polypeptides 
having as few as nine amino acid residues in length can be immunogenic when presented 
to an immune system in the context of a Major Histocompatibility Complex (MHC) 
molecule such as an MHC class I or MHC class II molecule. Accordingly, polypeptides 
having at least 9, 10, 11, 12, 13, 14, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 70, 80, 90, 100, 
150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 900, 1000, 1050, 
1100, 1150, 1200, 1250, 1300, 1350, or more consecutive amino acid residues of any 
amino acid sequence disclosed herein can be used as immunogens for producing 
antibodies. 

Monoclonal antibodies to any of the polypeptides disclosed herein can be 
prepared from murine hybridomas according to the classic method of Kohler & Milstein 
(Nature 256:495 (1975)) or a derivative method thereof. 

Polyclonal antiserum containing antibodies to the heterogeneous epitopes of any 
polypeptide disclosed herein can be prepared by immunizing suitable animals with the 
polypeptide (or fragment thereof), which can be unmodified or modified to enhance 
immunogenicity. An effective immunization protocol for rabbits can be found in 
Vaitukaitis et al. (J. Cl…). 

Antibody fragments can be used in place of whole antibodies and can be readily 
expressed in prokaryotic host cells. Methods of making and using immunologically 
effective portions of monoclonal antibodies, also referred to as "antibody fragments," are 
well known and include those described in Better & Horowitz (Methods Enzymol. 
178:476-496 (1989)), Glockshuber et al. (Biochemistry 29:1362-1367 (1990)), U.S. Pat. 
No. 5,648,237 ("Expression of Functional Antibody Fragments"), U.S. Pat. No. 4,946,778 
("Single Polypeptide Chain Binding Molecules"), U.S. Pat. No. 5,455,030 
("Immunotherapy Using Single Chain Polypeptide Binding Molecules"), and references 
cited therein. 

Operably linked: A first nucleic acid sequence is "operably linked" with a 
second nucleic acid sequence whenever the first nucleic acid sequence is placed in a 
functional relationship with the second nucleic acid sequence. For instance, a promoter is 
operably linked to a coding sequence if the promoter affects the transcription of the 
coding sequence. Generally, operably linked DNA sequences are contiguous and, where 
necessary to join two polypeptide-coding regions, in the same reading frame. 

Probes and primers: Nucleic acid probes and primers can be prepared readily 
based on the amino acid sequences and nucleic acid sequences provided herein. A 
"probe" includes an isolated nucleic acid containing a detectable label or reporter 
molecule. Typical labels include radioactive isotopes, ligands, chemiluminescent agents, 
and enzymes. Methods for labeling and guidance in the choice of labels appropriate for 
various purposes are discussed in, for example, Sambrook et al. (ed.), Molecular Cloning: 
A Laboratory Manual, 2nd ed., vol. 1-3, Cold Spring Harbor Laboratory Press, Cold 
Spring Harbor, N.Y., 1989, and Ausubel et al. (ed.), Current Protocols in Molecular 
Biology, Greene Publishing and Wiley-Interscience, New York (with periodic updates), 
1987. 

"Primers" are typically nucleic acid molecules having ten or more nucleotides 
(e.g., nucleic acid molecules having between about 10 nucleotides and about 100 
nucleotides). A primer can be annealed to a complementary target nucleic acid strand by 
nucleic acid hybridization to form a hybrid between the primer and the target nucleic acid 
strand, and then extended along the target nucleic acid strand by, for example, a DNA 
polymerase enzyme. Primer pairs can be used for amplification of a nucleic acid 
sequence, for example, by the polymerase chain reaction (PCR) or other nucleic-acid 
amplification methods known in the art. 

Methods for preparing and using probes and primers are described, for example, 
in references such as Sambrook et al. (ed.), Molecular Cloning: A Laboratory Manual, 
2nd ed., vol. 1-3, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989; 
Ausubel et al. (ed.), Current Protocols in Molecular Biology, Greene Publishing and 
Wiley-Interscience, New York (with periodic updates), 1987; and Innis et al., PCR 
Protocols: A Guide to Methods and Applications, Academic Press: San Diego, 1990. PCR 
primer pairs can be derived from a known sequence, for example, by using computer 
programs intended for that purpose such as Primer (Version 0.5, © 1991, 
Whitehead Institute for Biomedical Research, Cambridge, Mass.). One of skill in the art 
will appreciate that the specificity of a particular probe or primer increases with its 
length, but that a probe or primer can range in size from a full-length sequence to 
sequences as short as five consecutive nucleotides. Thus, for example, a primer of 20 
consecutive nucleotides can anneal to a target with a higher specificity than a 
corresponding primer of only 15 nucleotides. Thus, in order to obtain greater specificity, 
probes and primers can be selected that comprise, for example, 10, 20, 25, 30, 35, 40, 50, 
60, 70, 80, 90, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 
850, 900, 950, 1000, 1050, 1100, 1150, 1200, 1250, 1300, 1350, 1400, 1450, 1500, 1550, 
1600, 1650, 1700, 1750, 1800, 1850, 1900, 2000, 2050, 2100, 2150, 2200, 2250, 2300, 
2350, 2400, 2450, 2500, 2550, 2600, 2650, 2700, 2750, 2800, 2850, 2900, 3000, 3050, 
3100, 3150, 3200, 3250, 3300, 3350, 3400, 3450, 3500, 3550, 3600, 3650, 3700, 3750, 
3800, 3850, 3900, 4000, 4050, 4100, 4150, 4200, 4250, 4300, 4350, 4400, 4450, 4500, 
4550, 4600, 4650, 4700, 4750, 4800, 4850, 4900, 5000, 5050, 5100, 5150, 5200, 5250, 
5300, 5350, 5400, 5450, or more consecutive nucleotides. 

Percent sequence identity: The "percent sequence identity" between a particular 
nucleic acid or amino acid sequence and a sequence referenced by a particular sequence 
identification number is determined as follows. First, a nucleic acid or amino acid 
sequence is compared to the sequence set forth in a particular sequence identification 
number using the BLAST 2 Sequences (Bl2seq) program from the stand-alone version of 
BLASTZ containing BLASTN version 2.0.14 and BLASTP version 2.0.14. This 
stand-alone version of BLASTZ can be obtained from Fish & Richardson's web site 
(www.fr.com) or the United States government's National Center for Biotechnology 
Information web site (www.ncbi.nlm.nih.gov). Instructions explaining how to use the 
Bl2seq program can be found in the readme file accompanying BLASTZ. Bl2seq 
performs a comparison between two sequences using either the BLASTN or BLASTP 
algorithm. BLASTN is used to compare nucleic acid sequences, while BLASTP is used 
to compare amino acid sequences. To compare two nucleic acid sequences, the options 
are set as follows: -i is set to a file containing the first nucleic acid sequence to be 
compared (e.g., C:\seq1.txt); -j is set to a file containing the second nucleic acid sequence 
to be compared (e.g., C:\seq2.txt); -p is set to blastn; -o is set to any desired file name 
(e.g., C:\output.txt); -q is set to -1; -r is set to 2; and all other options are left at their 
default setting. For example, the following command can be used to generate an output 
file containing a comparison between two sequences: C:\Bl2seq -i c:\seq1.txt -j 
c:\seq2.txt -p blastn -o c:\output.txt -q -1 -r 2. To compare two amino acid sequences, 
the options of Bl2seq are set as follows: -i is set to a file containing the first amino acid 
sequence to be compared (e.g., C:\seq1.txt); -j is set to a file containing the second amino 
acid sequence to be compared (e.g., C:\seq2.txt); -p is set to blastp; -o is set to any desired 
file name (e.g., C:\output.txt); and all other options are left at their default setting. For 
example, the following command can be used to generate an output file containing a 
comparison between two amino acid sequences: C:\Bl2seq -i c:\seq1.txt -j c:\seq2.txt -p 
blastp -o c:\output.txt. If the two compared sequences share homology, then the 
designated output file will present those regions of homology as aligned sequences. If the 
two compared sequences do not share homology, then the designated output file will not 
present aligned sequences. 
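
The option settings described above can be assembled programmatically. The following is a minimal Python sketch that only builds the argument list and does not run the program. The function name `bl2seq_command` and its `kind` parameter are illustrative assumptions; the option letters (-i, -j, -p, -o, -q, -r) and their values are those given in the text.

```python
def bl2seq_command(seq1_path, seq2_path, output_path, kind):
    """Build the Bl2seq argument list described in the text.

    kind is "nucleic" (blastn with -q -1 -r 2) or "amino" (blastp with
    every other option left at its default setting).
    """
    if kind == "nucleic":
        program, extra = "blastn", ["-q", "-1", "-r", "2"]
    elif kind == "amino":
        program, extra = "blastp", []
    else:
        raise ValueError("kind must be 'nucleic' or 'amino'")
    return ["Bl2seq", "-i", seq1_path, "-j", seq2_path,
            "-p", program, "-o", output_path] + extra


# Raw strings keep the Windows-style backslashes from the text intact.
print(" ".join(bl2seq_command(r"c:\seq1.txt", r"c:\seq2.txt",
                              r"c:\output.txt", "nucleic")))
print(" ".join(bl2seq_command(r"c:\seq1.txt", r"c:\seq2.txt",
                              r"c:\output.txt", "amino")))
```

Returning a list rather than a single string is a design choice made here so that the arguments could be passed to a process launcher without shell quoting.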

Once aligned, the number of matches is determined by counting the number of 
positions where an identical nucleotide or amino acid residue is presented in both 
sequences. The percent sequence identity is determined by dividing the number of 
matches either by the length of the sequence set forth in the identified sequence (e.g., 
SEQ ID NO:1), or by an articulated length (e.g., 100 consecutive nucleotides or amino 
acid residues from a sequence set forth in an identified sequence), followed by 
multiplying the resulting value by 100. For example, a nucleic acid sequence that has 
1166 matches when aligned with the sequence set forth in SEQ ID NO:1 is 75.0 percent 
identical to the sequence set forth in SEQ ID NO:1 (i.e., 1166÷1554*100=75.0). It is 
noted that the percent sequence identity value is rounded to the nearest tenth. For 
example, 75.11, 75.12, 75.13, and 75.14 are rounded down to 75.1, while 75.15, 75.16, 
75.17, 75.18, and 75.19 are rounded up to 75.2. It is also noted that the length value will 
always be an integer. In another example, a target sequence containing a 20-nucleotide 
region that aligns with 20 consecutive nucleotides from an identified sequence as follows 
contains a region that shares 75 percent sequence identity to that identified sequence (i.e., 
15÷20*100=75). 

                     1                 20 
Target Sequence:     AGGTCGTGTACTGTCAGTCA 
                     | || ||| |||| |||| | 
Identified Sequence: ACGTGGTGAACTGCCAGTGA 
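
The calculation described above (matches divided by length, multiplied by 100, rounded to the nearest tenth with .x5 rounded up) can be sketched in Python as follows. The function names are illustrative assumptions, and the short sequences passed to `count_matches` are hypothetical inputs, not sequences from the specification. Decimal arithmetic is used here as an assumption about how to make the stated rounding rule exact, since binary floating point can round values such as 75.15 in either direction.

```python
from decimal import Decimal, ROUND_HALF_UP


def count_matches(aligned_a, aligned_b):
    """Count positions where two equal-length aligned sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    return sum(1 for x, y in zip(aligned_a, aligned_b) if x == y)


def round_to_tenth(value):
    """Round a Decimal to the nearest tenth; halves round up (75.15 -> 75.2)."""
    return value.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


def percent_identity(matches, length):
    """Return matches / length * 100, rounded to the nearest tenth."""
    if length <= 0:
        raise ValueError("length must be a positive integer")
    return round_to_tenth(Decimal(matches) * 100 / Decimal(length))


print(percent_identity(1166, 1554))   # first worked example in the text
print(percent_identity(15, 20))       # second worked example in the text
print(count_matches("ACGT", "AGGT"))  # hypothetical four-residue alignment
```

Here `length` is either the full length of the identified sequence or the articulated length, as the text allows.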

Conservative substitution: The term "conservative substitution" as used herein 
refers to any of the amino acid substitutions set forth in Table 1. Typically, conservative 
substitutions have little to no impact on the activity of a polypeptide. A polypeptide can 
be produced to contain one or more conservative substitutions by manipulating the 
nucleotide sequence that encodes that polypeptide using, for example, standard 
procedures such as site-directed mutagenesis or PCR. 

Table 1 

Original Residue    Conservative Substitution(s) 
Ala                 ser 
Arg                 lys 
Asn                 gln; his 
Asp                 glu 
Cys                 ser 
Gln                 asn 
Glu                 asp 
Gly                 pro 
His                 asn; gln 
Ile                 leu; val 
Leu                 ile; val 
Lys                 arg; gln; glu 
Met                 leu; ile 
Phe                 met; leu; tyr 
Ser                 thr 
Thr                 ser 
Trp                 tyr 
Tyr                 trp; phe 
Val                 ile; leu 
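
Table 1 can be read as a directional lookup from an original residue to its permitted substitutions. The following is a minimal Python sketch of such a lookup. The names `CONSERVATIVE` and `is_conservative` are illustrative assumptions, and the mapping is directional exactly as tabulated (for example, Ala to Ser is listed, whereas Ser to Ala is not).

```python
# Table 1 as a mapping: original residue -> set of conservative substitutions.
CONSERVATIVE = {
    "Ala": {"Ser"},
    "Arg": {"Lys"},
    "Asn": {"Gln", "His"},
    "Asp": {"Glu"},
    "Cys": {"Ser"},
    "Gln": {"Asn"},
    "Glu": {"Asp"},
    "Gly": {"Pro"},
    "His": {"Asn", "Gln"},
    "Ile": {"Leu", "Val"},
    "Leu": {"Ile", "Val"},
    "Lys": {"Arg", "Gln", "Glu"},
    "Met": {"Leu", "Ile"},
    "Phe": {"Met", "Leu", "Tyr"},
    "Ser": {"Thr"},
    "Thr": {"Ser"},
    "Trp": {"Tyr"},
    "Tyr": {"Trp", "Phe"},
    "Val": {"Ile", "Leu"},
}


def is_conservative(original, replacement):
    """Return True if replacing `original` with `replacement` is listed in Table 1.

    Three-letter codes are accepted in any letter case.
    """
    allowed = CONSERVATIVE.get(original.capitalize(), set())
    return replacement.capitalize() in allowed


print(is_conservative("Ala", "ser"))
print(is_conservative("Ser", "ala"))
```
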



II. Metabolic Pathways 

The invention provides methods and materials related to producing 3-HP as well 
as other organic compounds (e.g., 1,3-propanediol, acrylic acid, polymerized acrylate, 
esters of acrylate, polymerized 3-HP, and esters of 3-HP). Specifically, the invention 
provides isolated nucleic acids, polypeptides, host cells, and methods and materials for 
producing 3-HP as well as other organic compounds such as 1,3-propanediol, acrylic 
acid, polymerized acrylate, esters of acrylate, polymerized 3-HP, and esters of 3-HP. 

Accordingly, the invention provides several metabolic pathways that can be used 
to produce organic compounds from PEP (Figures 1-5, 43-44, 54, and 55). As depicted in 
Figure 1, lactate can be converted into lactyl-CoA by a polypeptide having CoA 
transferase activity (EC 2.8.3.1); the resulting lactyl-CoA can be converted into 
acrylyl-CoA by a polypeptide (or multiple polypeptide complex such as an activated E2 α and E2 
β complex) having lactyl-CoA dehydratase activity (EC 4.2.1.54); the resulting 
acrylyl-CoA can be converted into 3-hydroxypropionyl-CoA (3-HP-CoA) by a polypeptide 
having 3-hydroxypropionyl-CoA dehydratase activity (EC 4.2.1.-); and the resulting 
3-HP-CoA can be converted into 3-HP by a polypeptide having CoA transferase activity, a 
polypeptide having 3-hydroxypropionyl-CoA hydrolase activity (EC 3.1.2.-), or a 
polypeptide having 3-hydroxyisobutyryl-CoA hydrolase activity (EC 3.1.2.4). 
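
The pathway of Figure 1 as described above can be written down as an ordered list of conversion steps. The following is a minimal Python sketch; the `Step` type, the `FIGURE_1` list, and the `chain` helper are illustrative assumptions introduced here, while the metabolites, enzymatic activities, and EC numbers are those named in the text.

```python
from typing import NamedTuple, Tuple


class Step(NamedTuple):
    """One conversion: substrate -> product via any one of the listed activities."""
    substrate: str
    product: str
    activities: Tuple[Tuple[str, str], ...]  # (activity name, EC number) alternatives


FIGURE_1 = [
    Step("lactate", "lactyl-CoA",
         (("CoA transferase", "EC 2.8.3.1"),)),
    Step("lactyl-CoA", "acrylyl-CoA",
         (("lactyl-CoA dehydratase", "EC 4.2.1.54"),)),
    Step("acrylyl-CoA", "3-HP-CoA",
         (("3-hydroxypropionyl-CoA dehydratase", "EC 4.2.1.-"),)),
    Step("3-HP-CoA", "3-HP",
         (("CoA transferase", "EC 2.8.3.1"),
          ("3-hydroxypropionyl-CoA hydrolase", "EC 3.1.2.-"),
          ("3-hydroxyisobutyryl-CoA hydrolase", "EC 3.1.2.4"))),
]


def chain(steps):
    """Return the ordered metabolites, checking that each product feeds the next step."""
    metabolites = [steps[0].substrate]
    for step in steps:
        if step.substrate != metabolites[-1]:
            raise ValueError("gap before " + step.substrate)
        metabolites.append(step.product)
    return metabolites


print(" -> ".join(chain(FIGURE_1)))
```

The same structure could hold the pathways of the other figures by listing their steps in order.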

Polypeptides having CoA transferase activity as well as nucleic acid encoding 
such polypeptides can be obtained from various species including, without limitation, 
Megasphaera elsdenii, Clostridium propionicum, Clostridium kluyveri, and Escherichia 
coli. For example, nucleic acid that encodes a polypeptide having CoA transferase 
activity can be obtained from Megasphaera elsdenii as described in Example 1 and can 
have a sequence as set forth in SEQ ID NO: 1. In addition, polypeptides having CoA 
transferase activity as well as nucleic acid encoding such polypeptides can be obtained as 
described herein. For example, the variations to SEQ ID NO: 1 provided herein can be 
used to encode a polypeptide having CoA transferase activity. 

Polypeptides (or the polypeptides of a multiple polypeptide complex such as an 
activated E2 α and E2 β complex) having lactyl-CoA dehydratase activity as well as 
nucleic acid encoding such polypeptides can be obtained from various species including, 
without limitation, Megasphaera elsdenii and Clostridium propionicum. For example, 
nucleic acid encoding an E1 activator, an E2 α subunit, and an E2 β subunit that can form 
a multiple polypeptide complex having lactyl-CoA dehydratase activity can be obtained 
from Megasphaera elsdenii as described in Example 2. The nucleic acid encoding the E1 
activator can contain a sequence as set forth in SEQ ID NO: 9; the nucleic acid encoding 
the E2 α subunit can contain a sequence as set forth in SEQ ID NO: 17; and the nucleic 
acid encoding the E2 β subunit can contain a sequence as set forth in SEQ ID NO: 25. In 
addition, polypeptides (or the polypeptides of a multiple polypeptide complex) having 
lactyl-CoA dehydratase activity as well as nucleic acid encoding such polypeptides can be 
obtained as described herein. For example, the variations to SEQ ID NO: 9, 17, and 25 
provided herein can be used to encode the polypeptides of a multiple polypeptide 
complex having CoA transferase activity. 

Polypeptides having 3-hydroxypropionyl-CoA dehydratase activity as well as 
nucleic acid encoding such polypeptides can be obtained from various species including, 
without limitation, Chloroflexus aurantiacus, Candida rugosa, Rhodospirillum rubrum, 
and Rhodobacter capsulatus. For example, nucleic acid that encodes a polypeptide 
having 3-hydroxypropionyl-CoA dehydratase activity can be obtained from Chloroflexus 
aurantiacus as described in Example 3 and can have a sequence as set forth in SEQ ID 
NO: 40. In addition, polypeptides having 3-hydroxypropionyl-CoA dehydratase activity 
as well as nucleic acid encoding such polypeptides can be obtained as described herein. 
For example, the variations to SEQ ID NO: 40 provided herein can be used to encode a 
polypeptide having 3-hydroxypropionyl-CoA dehydratase activity. 

Polypeptides having 3-hydroxypropionyl-CoA hydrolase activity as well as 
nucleic acid encoding such polypeptides can be obtained from various species including, 
without limitation, Candida rugosa. Polypeptides having 3-hydroxyisobutyryl-CoA 
hydrolase activity as well as nucleic acid encoding such polypeptides can be obtained 
from various species including, without limitation, Pseudomonas fluorescens, Rattus, and 
Homo sapiens. For example, nucleic acid that encodes a polypeptide having 
3-hydroxyisobutyryl-CoA hydrolase activity can be obtained from Homo sapiens and can 
have a sequence as set forth in GenBank® accession number U66669. 

The term "polypeptide having enzymatic activity" as used herein refers to any 
polypeptide that catalyzes a chemical reaction of other substances without itself being 
destroyed or altered upon completion of the reaction. Typically, a polypeptide having 
enzymatic activity catalyzes the formation of one or more products from one or more 
substrates. Such polypeptides can have any type of enzymatic activity including, without 
limitation, the enzymatic activity or enzymatic activities associated with enzymes such as 
dehydratases/hydratases, 3-hydroxypropionyl-CoA dehydratases/hydratases, CoA 
transferases, lactyl-CoA dehydratases, 3-hydroxypropionyl-CoA hydrolases, 
3-hydroxyisobutyryl-CoA hydrolases, poly hydroxyacid synthases, CoA synthetases, 
malonyl-CoA reductases, β-alanine ammonia lyases, and lipases. 

As depicted in Figure 2, lactate can be converted into lactyl-CoA by a polypeptide 
having CoA synthetase activity (EC 6.2.1.-); the resulting lactyl-CoA can be converted 
into acrylyl-CoA by a polypeptide (or multiple polypeptide complex) having lactyl-CoA 
dehydratase activity; the resulting acrylyl-CoA can be converted into 3-HP-CoA by a 
polypeptide having 3-hydroxypropionyl-CoA dehydratase activity; and the resulting 
3-HP-CoA can be converted into polymerized 3-HP by a polypeptide having poly 
hydroxyacid synthase activity (EC 2.3.1.-). Polypeptides having CoA synthetase activity 
as well as nucleic acid encoding such polypeptides can be obtained from various species 
including, without limitation, Escherichia coli, Rhodobacter sphaeroides, Saccharomyces 
cerevisiae, and Salmonella enterica. For example, nucleic acid that encodes a polypeptide 
having CoA synthetase activity can be obtained from Escherichia coli and can have a 
sequence as set forth in GenBank® accession number U00006. Polypeptides (or multiple 
polypeptide complexes) having lactyl-CoA dehydratase activity as well as nucleic acid 
encoding such polypeptides can be obtained as provided herein. Polypeptides having 
3-hydroxypropionyl-CoA dehydratase activity as well as nucleic acid encoding such 
polypeptides also can be obtained as provided herein. Polypeptides having poly 
hydroxyacid synthase activity as well as nucleic acid encoding such polypeptides can be 
obtained from various species including, without limitation, Rhodobacter sphaeroides, 
Comamonas acidovorans, Ralstonia eutropha, and Pseudomonas oleovorans. For 
example, nucleic acid that encodes a polypeptide having poly hydroxyacid synthase 
activity can be obtained from Rhodobacter sphaeroides and can have a sequence as set 
forth in GenBank® accession number X97200. 

As depicted in Figure 3, lactate can be converted into lactyl-CoA by a polypeptide 
having CoA transferase activity; the resulting lactyl-CoA can be converted into 
acrylyl-CoA by a polypeptide (or multiple polypeptide complex) having lactyl-CoA dehydratase 
activity; the resulting acrylyl-CoA can be converted into 3-HP-CoA by a polypeptide 
having 3-hydroxypropionyl-CoA dehydratase activity; the resulting 3-HP-CoA can be 
converted into 3-HP by a polypeptide having CoA transferase activity, a polypeptide 
having 3-hydroxypropionyl-CoA hydrolase activity, or a polypeptide having 
3-hydroxyisobutyryl-CoA hydrolase activity; and the resulting 3-HP can be converted into an 
ester of 3-HP by a polypeptide having lipase activity (EC 3.1.1.-). Polypeptides having 
lipase activity as well as nucleic acid encoding such polypeptides can be obtained from 
various species including, without limitation, Candida rugosa, Candida tropicalis, and 
Candida albicans. For example, nucleic acid that encodes a polypeptide having lipase 
activity can be obtained from Candida rugosa and can have a sequence as set forth in 
GenBank® accession number A81171. 

As depicted in Figure 4, lactate can be converted into lactyl-CoA by a polypeptide 
having CoA synthetase activity; the resulting lactyl-CoA can be converted into 
acrylyl-CoA by a polypeptide (or multiple polypeptide complex) having lactyl-CoA dehydratase 
activity; and the resulting acrylyl-CoA can be converted into polymerized acrylate by a 
polypeptide having poly hydroxyacid synthase activity. 

As depicted in Figure 5, lactate can be converted into lactyl-CoA by a polypeptide 
having CoA transferase activity; the resulting lactyl-CoA can be converted into 
acrylyl-CoA by a polypeptide (or multiple polypeptide complex) having lactyl-CoA dehydratase 
activity; the resulting acrylyl-CoA can be converted into acrylate by a polypeptide having 
CoA transferase activity; and the resulting acrylate can be converted into an ester of 
acrylate by a polypeptide having lipase activity. 

As depicted in Figure 44, acetyl-CoA can be converted into malonyl-CoA by a 
polypeptide having acetyl-CoA carboxylase activity, and the resulting malonyl-CoA can 
be converted into 3-HP by a polypeptide having malonyl-CoA reductase activity. 
Polypeptides having acetyl-CoA carboxylase activity as well as nucleic acid encoding 
such polypeptides can be obtained from various species including, without limitation, 
Escherichia coli and Chloroflexus aurantiacus. For example, nucleic acid that encodes a 
polypeptide having acetyl-CoA carboxylase activity can be obtained from Escherichia 
coli and can have a sequence as set forth in GenBank® accession number M96394 or 
U18997. Polypeptides having malonyl-CoA reductase activity as well as nucleic acid 
encoding such polypeptides can be obtained from various species including, without 
limitation, Chloroflexus aurantiacus, Sulfolobus metallicus, and Acidianus brierleyi. For 
example, nucleic acid that encodes a polypeptide having malonyl-CoA reductase activity 
can be obtained as described herein and can have a sequence similar to the sequence set 
forth in SEQ ID NO: 140. In addition, polypeptides having malonyl-CoA reductase 
activity as well as nucleic acid encoding such polypeptides can be obtained as described 
herein. For example, the variations to SEQ ID NO: 140 provided herein can be used to 
encode a polypeptide having malonyl-CoA reductase activity. 

Polypeptides having malonyl-CoA reductase activity can use NADPH as a 
co-factor. For example, the polypeptide having the amino acid sequence set forth in SEQ ID 
NO: 141 is a polypeptide having malonyl-CoA reductase activity that uses NADPH as a 
co-factor when converting malonyl-CoA into 3-HP. Likewise, polypeptides having 
malonyl-CoA reductase activity can use NADH as a co-factor. Such polypeptides can be 
obtained by converting a polypeptide that has malonyl-CoA reductase activity and uses 
NADPH as a cofactor into a polypeptide that has malonyl-CoA reductase activity and 
uses NADH as a cofactor. Any method can be used to convert a polypeptide that uses 
NADPH as a cofactor into a polypeptide that uses NADH as a cofactor such as those 
described by others (Eppink et al., J. Mol. Biol., 292(1):87-96 (1999), Hall and Tomsett, 
Microbiology, 146(Pt 6):1399-406 (2000), and Dohr et al., Proc. Natl. Acad. Sci., 
98(1):81-86 (2001)). For example, mutagenesis can be used to convert the polypeptide 
encoded by the nucleic acid sequence set forth in SEQ ID NO: 140 into a polypeptide 
that, when converting malonyl-CoA into 3-HP, uses NADH as a co-factor instead of 
NADPH. 

As depicted in Figure 43, propionate can be converted into propionyl-CoA by a 
polypeptide having CoA synthetase activity such as the polypeptide having the sequence 
set forth in SEQ ID NO: 39; the resulting propionyl-CoA can be converted into 
acrylyl-CoA by a polypeptide having dehydrogenase activity such as the polypeptide having the 
sequence set forth in SEQ ID NO: 39; and the resulting acrylyl-CoA can be converted 
into (1) acrylate by a polypeptide having CoA transferase activity or CoA hydrolase 
activity, (2) 3-HP-CoA by a polypeptide having 3-HP dehydratase activity (also referred 
to as acrylyl-CoA hydratase or simply hydratase) such as the polypeptide having the 
sequence set forth in SEQ ID NO:39, or (3) polymerized acrylate by a polypeptide having 
poly hydroxyacid synthase activity. The resulting acrylate can be converted into an ester 
of acrylate by a polypeptide having lipase activity. The resulting 3-HP-CoA can be 
converted into (1) 3-HP by a polypeptide having CoA transferase activity, a polypeptide 
having 3-hydroxypropionyl-CoA hydrolase activity (EC 3.1.2.-), or a polypeptide having 
3-hydroxyisobutyryl-CoA hydrolase activity (EC 3.1.2.4), or (2) polymerized 3-HP by a 
polypeptide having poly hydroxyacid synthase activity (EC 2.3.1.-). 

As depicted in Figure 54, PEP can be converted into β-alanine. β-alanine can be 
converted into β-alanyl-CoA through the use of a polypeptide having CoA transferase 
activity. β-alanyl-CoA can then be converted into acrylyl-CoA through the use of a 
polypeptide having β-alanyl-CoA ammonia lyase activity. Acrylyl-CoA can then be 
converted into 3-HP-CoA through the use of a polypeptide having 3-HP-CoA dehydratase 
activity, and a polypeptide having glutamate dehydrogenase activity can be used to 
convert 3-HP-CoA into 3-HP. 

As depicted in Figure 55, 3-HP can be made from β-alanine by first contacting
β-alanine with a polypeptide having 4,4-aminobutyrate aminotransferase activity to create
malonate semialdehyde. The malonate semialdehyde can be converted into 3-HP with a
polypeptide having 3-HP dehydrogenase activity or a polypeptide having
3-hydroxyisobutyrate dehydrogenase activity.

III. Nucleic Acid Molecules and Polypeptides

The invention provides isolated nucleic acid that contains the entire nucleic acid
sequence set forth in SEQ ID NO:1, 9, 17, 25, 33, 34, 36, 38, 40, 42, 129, 140, 142, 162,
or 163. In addition, the invention provides isolated nucleic acid that contains a portion of
the nucleic acid sequence set forth in SEQ ID NO:1, 9, 17, 25, 33, 34, 36, 38, 40, 42, 129,
140, 142, 162, or 163. For example, the invention provides isolated nucleic acid that
contains a 15 nucleotide sequence identical to any 15 nucleotide sequence set forth in
SEQ ID NO:1, 9, 17, 25, 33, 34, 36, 38, 40, 42, 129, 140, 142, 162, or 163 including,
without limitation, the sequence starting at nucleotide number 1 and ending at nucleotide
number 15, the sequence starting at nucleotide number 2 and ending at nucleotide number
16, the sequence starting at nucleotide number 3 and ending at nucleotide number 17, and
so forth. It will be appreciated that the invention also provides isolated nucleic acid that
contains a nucleotide sequence that is greater than 15 nucleotides (e.g., 16, 17, 18, 19, 20,
21, 22, 23, 24, 25, 26, 27, 28, 29, 30, or more nucleotides) in length and identical to any
portion of the sequence set forth in SEQ ID NO:1, 9, 17, 25, 33, 34, 36, 38, 40, 42, 129,
140, 142, 162, or 163. For example, the invention provides isolated nucleic acid that
contains a 25 nucleotide sequence identical to any 25 nucleotide sequence set forth in
SEQ ID NO:1, 9, 17, 25, 33, 34, 36, 38, 40, 42, 129, 140, 142, 162, or 163 including,
without limitation, the sequence starting at nucleotide number 1 and ending at nucleotide
number 25, the sequence starting at nucleotide number 2 and ending at nucleotide number
26, the sequence starting at nucleotide number 3 and ending at nucleotide number 27, and
so forth. Additional examples include, without limitation, isolated nucleic acids that
contain a nucleotide sequence that is 50 or more nucleotides (e.g., 100, 150, 200, 250,
300, or more nucleotides) in length and identical to any portion of the sequence set forth
in SEQ ID NO:1, 9, 17, 25, 33, 34, 36, 38, 40, 42, 129, 140, 142, 162, or 163. Such
isolated nucleic acids can include, without limitation, those isolated nucleic acids
containing a nucleic acid sequence represented in a single line of sequence depicted in
Figure 6, 10, 14, 18, 22, 23, 25, 27, 29, 31, 39, 49, or 51 since each line of sequence
depicted in these figures, with the possible exception of the last line, provides a
nucleotide sequence containing at least 50 bases.
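
The enumeration of fixed-length portions described above (positions 1 to 15, 2 to 16, 3 to 17, and so forth) amounts to a sliding window over the sequence. The following Python sketch is illustrative only; the function name `portions` and the short example sequence are hypothetical and are not taken from any SEQ ID NO.

```python
def portions(sequence, length):
    """Return (start, end, subsequence) tuples for every contiguous portion
    of the given length, using 1-based inclusive positions as in the text."""
    if length < 1:
        raise ValueError("length must be at least 1")
    return [
        (start + 1, start + length, sequence[start:start + length])
        for start in range(len(sequence) - length + 1)
    ]


# Hypothetical 20-nucleotide sequence used only for illustration.
example = "atggctaaacgtgcattgca"
windows = portions(example, 15)
for start, end, sub in windows[:3]:
    print(start, end, sub)
```

A sequence of N nucleotides yields N - 14 such 15-nucleotide portions; a sequence shorter than the requested length yields none.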

In addition, the invention provides isolated nucleic acid that contains a variation
of the nucleic acid sequence set forth in SEQ ID NO:1, 9, 17, 25, 33, 34, 36, 38, 40, 42,
129, 140, 142, 162, or 163. For example, the invention provides isolated nucleic acid
containing a nucleic acid sequence set forth in SEQ ID NO:1, 9, 17, 25, 33, 34, 36, 38,
40, 42, 129, 140, 142, 162, or 163 that contains a single insertion, a single deletion, a
single substitution, multiple insertions, multiple deletions, multiple substitutions, or any
combination thereof (e.g., single deletion together with multiple insertions). Such
isolated nucleic acid molecules can share at least 60, 65, 70, 75, 80, 85, 90, 95, 97, 98, or
99 percent sequence identity with a sequence set forth in SEQ ID NO:1, 9, 17, 25, 33, 34,
36, 38, 40, 42, 129, 140, 142, 162, or 163.
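
How percent sequence identity is computed depends on the alignment method and on the definition adopted for it, which is not restated in this passage. As a minimal sketch, assuming two sequences that have already been aligned to equal length with "-" marking gaps, identity can be taken as the number of matching positions divided by the aligned length. The function name and the toy sequences below are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned columns in which both sequences carry the same
    residue. Assumption: identity is counted over the full aligned length,
    and a column containing a gap never counts as a match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b)
        if a == b and a != gap
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical aligned pair: one substitution and one deletion in ten columns.
reference = "atggctaaac"
variant = "atgcct-aac"
identity = percent_identity(reference, variant)
print(identity)
```

Other conventions (for example, dividing by the length of the shorter sequence) give different numbers for the same pair, so the convention should be stated whenever an identity threshold is applied.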

The invention provides multiple examples of isolated nucleic acid that contains a
variation of a nucleic acid sequence set forth in SEQ ID NO:1, 9, 17, 25, 33, 34, 36, 38,
40, 42, 129, 140, 142, 162, or 163. For example, Figure 8 provides the sequence set forth
in SEQ ID NO:1 aligned with three other nucleic acid sequences. Examples of variations
of the sequence set forth in SEQ ID NO:1 include, without limitation, any variation of the
sequence set forth in SEQ ID NO:1 provided in Figure 8. Such variations are provided in
Figure 8 in that a comparison of the nucleotide (or lack thereof) at a particular position of
the sequence set forth in SEQ ID NO:1 with the nucleotide (or lack thereof) at the same
aligned position of any of the other three nucleic acid sequences depicted in Figure 8 (i.e.,
SEQ ID NOs:3, 4, and 5) provides a list of specific changes for the sequence set forth in
SEQ ID NO:1. For example, the "a" at position 49 of SEQ ID NO:1 can be substituted
with a "c" as indicated in Figure 8. As also indicated in Figure 8, the "a" at position 590
of SEQ ID NO:1 can be substituted with an "atgg"; an "aaac" can be inserted before the
V at position 393 of SEQ ID NO:1; or the "gaa" at position 736 of SEQ ID NO:1 can be
deleted. It will be appreciated that the sequence set forth in SEQ ID NO:1 can contain
any number of variations as well as any combination of types of variations. For example,
the sequence set forth in SEQ ID NO:1 can contain one variation provided in Figure 8 or
more than one (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 50, 100, or more) of the variations
provided in Figure 8. It is noted that the nucleic acid sequences provided by Figure 8 can
encode polypeptides having CoA transferase activity. The invention also provides
isolated nucleic acid that contains a variant of a portion of the sequence set forth in SEQ
ID NO:1 as depicted in Figure 8 and described herein.
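
Reading specific changes off an alignment, as described above, is a column-by-column comparison of the reference sequence with each aligned sequence. The Python sketch below illustrates the idea under stated assumptions: both inputs are already aligned to equal length, "-" marks a gap, and the toy sequences are hypothetical rather than taken from Figure 8. Insertions are reported relative to the last reference position seen, which is one of several possible conventions.

```python
def list_variations(reference, other, gap="-"):
    """Compare two aligned sequences column by column and return
    (kind, reference_position, reference_residue, other_residue) tuples.
    Positions are 1-based and count only non-gap reference residues."""
    if len(reference) != len(other):
        raise ValueError("aligned sequences must have equal length")
    variations = []
    position = 0
    for ref_res, other_res in zip(reference, other):
        if ref_res != gap:
            position += 1
        if ref_res == other_res:
            continue  # identical column (or gap in both): no change
        if ref_res == gap:
            # Residue present only in the other sequence: an insertion
            # after the reference position counted so far.
            variations.append(("insertion", position, "", other_res))
        elif other_res == gap:
            variations.append(("deletion", position, ref_res, ""))
        else:
            variations.append(("substitution", position, ref_res, other_res))
    return variations


# Hypothetical alignment used only for illustration.
aligned_reference = "atg-ctaaac"
aligned_other = "atgtcta-gc"
changes = list_variations(aligned_reference, aligned_other)
for change in changes:
    print(change)
```

Running the comparison against each of several aligned sequences and pooling the results gives the kind of list of substitutions, insertions, and deletions that the text attributes to the figure.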

Likewise, Figure 12 provides variations of SEQ ID NO:9 and portions thereof;
Figure 16 provides variations of SEQ ID NO:17 and portions thereof; Figure 20 provides
variations of SEQ ID NO:25 and portions thereof; Figure 32 provides variations of SEQ
ID NO:40 and portions thereof; and Figure 53 provides variations of SEQ ID NO:140.

The invention provides isolated nucleic acid that contains a nucleic acid sequence
that encodes the entire amino acid sequence set forth in SEQ ID NO:2, 10, 18, 26, 35, 37,
39, 41, 141, 160, or 161. In addition, the invention provides isolated nucleic acid that
contains a nucleic acid sequence that encodes a portion of the amino acid sequence set
forth in SEQ ID NO:2, 10, 18, 26, 35, 37, 39, 41, 141, 160, or 161. For example, the
invention provides isolated nucleic acid that contains a nucleic acid sequence that encodes
a 15 amino acid sequence identical to any 15 amino acid sequence set forth in SEQ ID
NO:2, 10, 18, 26, 35, 37, 39, 41, 141, 160, or 161 including, without limitation, the
sequence starting at amino acid residue number 1 and ending at amino acid residue
number 15, the sequence starting at amino acid residue number 2 and ending at amino
acid residue number 16, the sequence starting at amino acid residue number 3 and ending
at amino acid residue number 17, and so forth. It will be appreciated that the invention
also provides isolated nucleic acid that contains a nucleic acid sequence that encodes an
amino acid sequence that is greater than 15 amino acid residues (e.g., 16, 17, 18, 19, 20,
21, 22, 23, 24, 25, 26, 27, 28, 29, 30, or more amino acid residues) in length and identical
to any portion of the sequence set forth in SEQ ID NO:2, 10, 18, 26, 35, 37, 39, 41, 141,
160, or 161. For example, the invention provides isolated nucleic acid that contains a
nucleic acid sequence that encodes a 25 amino acid sequence identical to any 25 amino
acid sequence set forth in SEQ ID NO:2, 10, 18, 26, 35, 37, 39, 41, 141, 160, or 161
including, without limitation, the sequence starting at amino acid residue number 1 and
ending at amino acid residue number 25, the sequence starting at amino acid residue
number 2 and ending at amino acid residue number 26, the sequence starting at amino
acid residue number 3 and ending at amino acid residue number 27, and so forth.
Additional examples include, without limitation, isolated nucleic acids that contain a
nucleic acid sequence that encodes an amino acid sequence that is 50 or more amino acid
residues (e.g., 100, 150, 200, 250, 300, or more amino acid residues) in length and
identical to any portion of the sequence set forth in SEQ ID NO:2, 10, 18, 26, 35, 37, 39,
41, 141, 160, or 161. Such isolated nucleic acids can include, without limitation, those
isolated nucleic acids containing a nucleic acid sequence that encodes an amino acid
sequence represented in a single line of sequence depicted in Figure 7, 11, 15, 19, 24, 26,
28, 30, or 50 since each line of sequence depicted in these figures, with the possible
exception of the last line, provides an amino acid sequence containing at least 50 residues.

In addition, the invention provides isolated nucleic acid that contains a nucleic
acid sequence that encodes an amino acid sequence having a variation of the amino acid
sequence set forth in SEQ ID NO:2, 10, 18, 26, 35, 37, 39, 41, 141, 160, or 161. For
example, the invention provides isolated nucleic acid containing a nucleic acid sequence
encoding an amino acid sequence set forth in SEQ ID NO:2, 10, 18, 26, 35, 37, 39, 41,
141, 160, or 161 that contains a single insertion, a single deletion, a single substitution,
multiple insertions, multiple deletions, multiple substitutions, or any combination thereof
(e.g., single deletion together with multiple insertions). Such isolated nucleic acid
molecules can contain a nucleic acid sequence encoding an amino acid sequence that
shares at least 60, 65, 70, 75, 80, 85, 90, 95, 97, 98, or 99 percent sequence identity with a
sequence set forth in SEQ ID NO:2, 10, 18, 26, 35, 37, 39, 41, 141, 160, or 161.

The invention provides multiple examples of isolated nucleic acid containing a
nucleic acid sequence encoding an amino acid sequence having a variation of an amino
acid sequence set forth in SEQ ID NO:2, 10, 18, 26, 35, 37, 39, 41, 141, 160, or 161. For
example, Figure 9 provides the amino acid sequence set forth in SEQ ID NO:2 aligned
with three other amino acid sequences. Examples of variations of the sequence set forth
in SEQ ID NO:2 include, without limitation, any variation of the sequence set forth in
SEQ ID NO:2 provided in Figure 9. Such variations are provided in Figure 9 in that a
comparison of the amino acid residue (or lack thereof) at a particular position of the
sequence set forth in SEQ ID NO:2 with the amino acid residue (or lack thereof) at the
same aligned position of any of the other three amino acid sequences of Figure 9 (i.e.,
SEQ ID NOs:6, 7, and 8) provides a list of specific changes for the sequence set forth in
SEQ ID NO:2. For example, the "k" at position 17 of SEQ ID NO:2 can be substituted
with a "p" or "h" as indicated in Figure 9. As also indicated in Figure 9, the V at
position 125 of SEQ ID NO:2 can be substituted with an "i" or "f". It will be appreciated
that the sequence set forth in SEQ ID NO:2 can contain any number of variations as well
as any combination of types of variations. For example, the sequence set forth in SEQ ID
NO:2 can contain one variation provided in Figure 9 or more than one (e.g., 2, 3, 4, 5, 6,
7, 8, 9, 10, 15, 20, 25, 50, 100, or more) of the variations provided in Figure 9. It is noted
that the amino acid sequences provided in Figure 9 can be polypeptides having CoA
transferase activity.

The invention also provides isolated nucleic acid containing a nucleic acid
sequence encoding an amino acid sequence that contains a variant of a portion of the
sequence set forth in SEQ ID NO:2 as depicted in Figure 9 and described herein.

Likewise, Figure 13 provides variations of SEQ ID NO:10 and portions thereof;
Figure 17 provides variations of SEQ ID NO:18 and portions thereof; Figure 21 provides
variations of SEQ ID NO:26 and portions thereof; Figure 33 provides variations of SEQ
ID NO:41 and portions thereof; Figures 40, 41, and 42 provide variations of SEQ ID
NO:39; and Figure 52 provides variations of SEQ ID NO:141 and portions thereof.

It is noted that codon preferences and codon usage tables for a particular species
can be used to engineer isolated nucleic acid molecules that take advantage of the codon
usage preferences of that particular species. For example, the isolated nucleic acid
provided herein can be designed to have codons that are preferentially used by a
particular organism of interest.

The invention also provides isolated nucleic acid that is at least about 12 bases in
length (e.g., at least about 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 40, 50, 60, 100, 250, 500,
750, 1000, 1500, 2000, 3000, 4000, or 5000 bases in length) and hybridizes, under
hybridization conditions, to the sense or antisense strand of a nucleic acid having the
sequence set forth in SEQ ID NO:1, 9, 17, 25, 33, 34, 36, 38, 40, 42, 129, 140, 142, 162,
or 163. The hybridization conditions can be moderately or highly stringent hybridization
conditions.

The invention provides polypeptides that contain the entire amino acid sequence
set forth in SEQ ID NO:2, 10, 18, 26, 35, 37, 39, 41, 141, 160, or 161. In addition, the
invention provides polypeptides that contain a portion of the amino acid sequence set
forth in SEQ ID NO:2, 10, 18, 26, 35, 37, 39, 41, 141, 160, or 161. For example, the
invention provides polypeptides that contain a 15 amino acid sequence identical to any 15
amino acid sequence set forth in SEQ ID NO:2, 10, 18, 26, 35, 37, 39, 41, 141, 160, or
161 including, without limitation, the sequence starting at amino acid residue number 1
and ending at amino acid residue number 15, the sequence starting at amino acid residue
number 2 and ending at amino acid residue number 16, the sequence starting at amino
acid residue number 3 and ending at amino acid residue number 17, and so forth. It will
be appreciated that the invention also provides polypeptides that contain an amino acid
sequence that is greater than 15 amino acid residues (e.g., 16, 17, 18, 19, 20, 21, 22, 23,
24, 25, 26, 27, 28, 29, 30, or more amino acid residues) in length and identical to any
portion of the sequence set forth in SEQ ID NO:2, 10, 18, 26, 35, 37, 39, 41, 141, 160, or
161. For example, the invention provides polypeptides that contain a 25 amino acid
sequence identical to any 25 amino acid sequence set forth in SEQ ID NO:2, 10, 18, 26,
35, 37, 39, 41, 141, 160, or 161 including, without limitation, the sequence starting at
amino acid residue number 1 and ending at amino acid residue number 25, the sequence
starting at amino acid residue number 2 and ending at amino acid residue number 26, the
sequence starting at amino acid residue number 3 and ending at amino acid residue
number 27, and so forth. Additional examples include, without limitation, polypeptides
that contain an amino acid sequence that is 50 or more amino acid residues (e.g., 100,
150, 200, 250, 300, or more amino acid residues) in length and identical to any portion of
the sequence set forth in SEQ ID NO:2, 10, 18, 26, 35, 37, 39, 41, 141, 160, or 161. Such
polypeptides can include, without limitation, those polypeptides containing an amino acid
sequence represented in a single line of sequence depicted in Figure 7, 11, 15, 19, 24, 26,
28, 30, or 50 since each line of sequence depicted in these figures, with the possible
exception of the last line, provides an amino acid sequence containing at least 50 residues.

In addition, the invention provides polypeptides that contain an amino acid sequence
having a variation of the amino acid sequence set forth in SEQ ID NO:2, 10, 18, 26, 35,
37, 39, 41, 141, 160, or 161. For example, the invention provides polypeptides
containing an amino acid sequence set forth in SEQ ID NO:2, 10, 18, 26, 35, 37, 39, 41,
141, 160, or 161 that contains a single insertion, a single deletion, a single substitution,
multiple insertions, multiple deletions, multiple substitutions, or any combination thereof
(e.g., single deletion together with multiple insertions). Such polypeptides can contain an
amino acid sequence that shares at least 60, 65, 70, 75, 80, 85, 90, 95, 97, 98, or 99
percent sequence identity with a sequence set forth in SEQ ID NO:2, 10, 18, 26, 35, 37,
39, 41, 141, 160, or 161.

The invention provides multiple examples of polypeptides containing an amino
acid sequence having a variation of an amino acid sequence set forth in SEQ ID NO:2,
10, 18, 26, 35, 37, 39, 41, 141, 160, or 161. For example, Figure 9 provides the amino
acid sequence set forth in SEQ ID NO:2 aligned with three other amino acid sequences.
Examples of variations of the sequence set forth in SEQ ID NO:2 include, without
limitation, any variation of the sequence set forth in SEQ ID NO:2 provided in Figure 9.
Such variations are provided in Figure 9 in that a comparison of the amino acid residue
(or lack thereof) at a particular position of the sequence set forth in SEQ ID NO:2 with
the amino acid residue (or lack thereof) at the same aligned position of any of the other
three amino acid sequences of Figure 9 (i.e., SEQ ID NOs:6, 7, and 8) provides a list of
specific changes for the sequence set forth in SEQ ID NO:2. For example, the "k" at
position 17 of SEQ ID NO:2 can be substituted with a "p" or "h" as indicated in Figure 9.
As also indicated in Figure 9, the V at position 125 of SEQ ID NO:2 can be substituted
with an "i" or "f". It will be appreciated that the sequence set forth in SEQ ID NO:2 can
contain any number of variations as well as any combination of types of variations. For
example, the sequence set forth in SEQ ID NO:2 can contain one variation provided in
Figure 9 or more than one (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 50, 100, or more) of
the variations provided in Figure 9. It is noted that the amino acid sequences provided in
Figure 9 can be polypeptides having CoA transferase activity.

The invention also provides polypeptides containing an amino acid sequence that 
contains a variant of a portion of the sequence set forth in SEQ ID NO:2 as depicted in 
Figure 9 and described herein. 

Likewise, Figure 13 provides variations of SEQ ID NO:10 and portions thereof;
Figure 17 provides variations of SEQ ID NO:18 and portions thereof; Figure 21 provides
variations of SEQ ID NO:26 and portions thereof; Figure 33 provides variations of SEQ
ID NO:41 and portions thereof; Figures 40, 41, and 42 provide variations of SEQ ID
NO:39; and Figure 52 provides variations of SEQ ID NO:141 and portions thereof.

Polypeptides having a variant amino acid sequence can retain enzymatic activity.
Such polypeptides can be produced by manipulating the nucleotide sequence encoding a
polypeptide using standard procedures such as site-directed mutagenesis or PCR. One
type of modification includes the substitution of one or more amino acid residues for
amino acid residues having a similar biochemical property. For example, a polypeptide
can have an amino acid sequence set forth in SEQ ID NO:2, 10, 18, 26, 35, 37, 39, 41,
141, 160, or 161 with one or more conservative substitutions.

More substantial changes can be obtained by selecting substitutions that are less
conservative than those in Table 1, i.e., selecting residues that differ more significantly in
their effect on maintaining: (a) the structure of the polypeptide backbone in the area of the
substitution, for example, as a sheet or helical conformation; (b) the charge or
hydrophobicity of the polypeptide at the target site; or (c) the bulk of the side chain. The
substitutions that in general are expected to produce the greatest changes in polypeptide
function are those in which: (a) a hydrophilic residue, e.g., serine or threonine, is
substituted for (or by) a hydrophobic residue, e.g., leucine, isoleucine, phenylalanine,
valine or alanine; (b) a cysteine or proline is substituted for (or by) any other residue; (c)
a residue having an electropositive side chain, e.g., lysine, arginine, or histidine, is
substituted for (or by) an electronegative residue, e.g., glutamic acid or aspartic acid; or
(d) a residue having a bulky side chain, e.g., phenylalanine, is substituted for (or by) one
not having a side chain, e.g., glycine. The effects of these amino acid substitutions (or
other deletions or additions) can be assessed for polypeptides having enzymatic activity
by analyzing the ability of the polypeptide to catalyze the conversion of the same
substrate as the related native polypeptide to the same product as the related native
polypeptide. Accordingly, polypeptides having 5, 10, 20, 30, 40, 50 or fewer conservative
substitutions are provided by the invention.

Polypeptides and nucleic acid encoding polypeptide can be produced by standard
DNA mutagenesis techniques, for example, M13 primer mutagenesis. Details of these
techniques are provided in Sambrook et al. (ed.), Molecular Cloning: A Laboratory
Manual, 2nd ed., vol. 1-3, Cold Spring Harbor Laboratory Press, Cold Spring Harbor,
N.Y., 1989, Ch. 15. Nucleic acid molecules can contain changes of a coding region to fit
the codon usage bias of the particular organism into which the molecule is to be
introduced.

Alternatively, the coding region can be altered by taking advantage of the
degeneracy of the genetic code to alter the coding sequence in such a way that, while the
nucleic acid sequence is substantially altered, it nevertheless encodes a polypeptide
having an amino acid sequence identical or substantially similar to the native amino acid
sequence. For example, the ninth amino acid residue of the sequence set forth in SEQ ID
NO:2 is alanine, which is encoded in the open reading frame by the nucleotide codon
triplet GCT. Because of the degeneracy of the genetic code, three other nucleotide codon
triplets (GCA, GCC, and GCG) also code for alanine. Thus, the nucleic acid sequence
of the open reading frame can be changed at this position to any of these three codons
without affecting the amino acid sequence of the encoded polypeptide or the
characteristics of the polypeptide. Based upon the degeneracy of the genetic code,
nucleic acid variants can be derived from a nucleic acid sequence disclosed herein using
standard DNA mutagenesis techniques as described herein, or by synthesis of nucleic acid
sequences. Thus, this invention also encompasses nucleic acid molecules that encode the
same polypeptide but vary in nucleic acid sequence by virtue of the degeneracy of the
genetic code.
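
The alanine example above can be checked against the standard genetic code. The Python sketch below builds the standard codon table (NCBI translation table 1, an assumption of this sketch; organisms using other tables would differ) and lists the codons that are synonymous with a given codon. The names `CODON_TABLE` and `synonymous_codons` are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G and the
# first base varying slowest (NCBI translation table 1). "*" marks stop codons.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons(codon):
    """Return the other codons that encode the same amino acid as `codon`."""
    codon = codon.upper()
    amino_acid = CODON_TABLE[codon]
    return sorted(
        other for other, aa in CODON_TABLE.items()
        if aa == amino_acid and other != codon
    )


print(CODON_TABLE["GCT"])        # one-letter code of the encoded residue
print(synonymous_codons("GCT"))  # alternative codons for the same residue
```

Replacing a codon with any member of its synonymous set changes the nucleic acid sequence while leaving the encoded amino acid sequence unchanged.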
IV. Methods of Making 3-HP and Other Organic Acids

Each step provided in the pathways depicted in Figures 1-5, 43-44, 54, and 55 can
be performed within a cell (in vivo) or outside a cell (in vitro, e.g., in a container or
column). Additionally, the organic acid products can be generated through a combination
of in vivo synthesis and in vitro synthesis. Moreover, the in vitro synthesis step, or steps,
can be via chemical reaction or enzymatic reaction.

For example, a microorganism provided herein can be used to perform the steps
provided in Figure 1, or an extract containing polypeptides having the indicated
enzymatic activities can be used to perform the steps provided in Figure 1. In addition,
chemical treatments can be used to perform the conversions provided in Figures 1-5,
43-44, 54, and 55. For example, acrylyl-CoA can be converted into acrylate by hydrolysis.
Other chemical treatments include, without limitation, transesterification to convert
acrylate into an acrylate ester.

Carbon sources suitable as starting points for bioconversion include carbohydrates
and synthetic intermediates. Examples of carbohydrates which cells are capable of
metabolizing to pyruvate include sugars such as dextrose, triglycerides, and fatty acids.

Additionally, intermediate chemical products can be starting points. For example,
acetic acid and carbon dioxide can be introduced into a fermentation broth. Acetyl-CoA,
malonyl-CoA, and 3-HP can be sequentially produced using a polypeptide having CoA
synthase activity, a polypeptide having acetyl-CoA carboxylase activity, and a
polypeptide having malonyl-CoA reductase activity. Other useful intermediate chemical
starting points can include propionic acid, acrylic acid, lactic acid, pyruvic acid, and
β-alanine.

A. Expression of Polypeptides

The polypeptides described herein can be produced individually in a host cell or in
combination in a host cell. Moreover, the polypeptides having a particular enzymatic
activity can be a polypeptide that is either naturally-occurring or non-naturally-occurring.
A naturally-occurring polypeptide is any polypeptide having an amino acid sequence as
found in nature, including wild-type and polymorphic polypeptides. Such
naturally-occurring polypeptides can be obtained from any species including, without limitation,
animal (e.g., mammalian), plant, fungal, and bacterial species. A non-naturally-occurring
polypeptide is any polypeptide having an amino acid sequence that is not found in nature.
Thus, a non-naturally-occurring polypeptide can be a mutated version of a
naturally-occurring polypeptide, or an engineered polypeptide. For example, a
non-naturally-occurring polypeptide having 3-hydroxypropionyl-CoA dehydratase activity can be a
mutated version of a naturally-occurring polypeptide having 3-hydroxypropionyl-CoA
dehydratase activity that retains at least some 3-hydroxypropionyl-CoA dehydratase
activity. A polypeptide can be mutated by, for example, sequence additions, deletions,
substitutions, or combinations thereof.

The invention provides genetically modified cells that can be used to perform one
or more of the steps in the metabolic pathways described herein, or the genetically
modified cells can be used to produce the disclosed polypeptides for subsequent use in
vitro. For example, an individual microorganism can contain exogenous nucleic acid
such that each of the polypeptides necessary to perform the steps depicted in Figures 1, 2,
3, 4, 5, 43, 44, 54, or 55 is expressed. It is important to note that such cells can contain
any number of exogenous nucleic acid molecules. For example, a particular cell can
contain six exogenous nucleic acid molecules with each one encoding one of the six
polypeptides necessary to convert lactate into 3-HP as depicted in Figure 1, or a particular
cell can endogenously produce polypeptides necessary to convert lactate into acrylyl-CoA
while containing exogenous nucleic acid that encodes polypeptides necessary to convert
acrylyl-CoA into 3-HP.

In addition, a single exogenous nucleic acid molecule can encode one or more
than one polypeptide. For example, a single exogenous nucleic acid molecule can contain
sequences that encode three different polypeptides. Further, the cells described herein
can contain a single copy, or multiple copies (e.g., about 5, 10, 20, 35, 50, 75, 100 or 150
copies), of a particular exogenous nucleic acid molecule. For example, a particular cell
can contain about 50 copies of the constructs depicted in Figure 34, 35, 36, 37, 38, or 45.
Again, the cells described herein can contain more than one particular exogenous nucleic
acid molecule. For example, a particular cell can contain about 50 copies of exogenous
nucleic acid molecule X as well as about 75 copies of exogenous nucleic acid molecule
Y.

In another embodiment, a cell within the scope of the invention can contain an
exogenous nucleic acid molecule that encodes a polypeptide having
3-hydroxypropionyl-CoA dehydratase activity. Such cells can have any level of 3-hydroxypropionyl-CoA
dehydratase activity. For example, a cell containing an exogenous nucleic acid molecule
that encodes a polypeptide having 3-hydroxypropionyl-CoA dehydratase activity can
have 3-hydroxypropionyl-CoA dehydratase activity with a specific activity greater than
about 1 mg 3-HP-CoA formed per gram dry cell weight per hour (e.g., greater than about
10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 125, 150, 200, 250, 300, 350, 400, 500, or more
mg 3-HP-CoA formed per gram dry cell weight per hour). Alternatively, a cell can have
3-hydroxypropionyl-CoA dehydratase activity such that a cell extract from 1x10^6 cells
has a specific activity greater than about 1 µg 3-HP-CoA formed per mg total protein per
10 minutes (e.g., greater than about 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 125, 150, 200,
250, 300, 350, 400, 500, or more µg 3-HP-CoA formed per mg total protein per 10
minutes).
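
The specific activity figures above are simple ratios: amount of product formed, divided by the amount of biomass and by the elapsed time. The Python sketch below works one such calculation; the function name and the measurement values are hypothetical and serve only to show the arithmetic.

```python
def specific_activity(product_mg, dry_cell_weight_g, hours):
    """Return mg of product formed per gram dry cell weight per hour."""
    if dry_cell_weight_g <= 0 or hours <= 0:
        raise ValueError("dry cell weight and time must be positive")
    return product_mg / (dry_cell_weight_g * hours)


# Hypothetical measurement: 12 mg 3-HP-CoA formed by 0.5 g dry cell
# weight over 2 hours.
activity = specific_activity(product_mg=12.0, dry_cell_weight_g=0.5, hours=2.0)
print(activity)        # mg per gram dry cell weight per hour
print(activity > 1.0)  # compare against the "greater than about 1" threshold
```

The same ratio applies to the cell-extract measure, with micrograms of product, milligrams of total protein, and a 10-minute interval substituted for the units used here.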

A nucleic acid molecule encoding a polypeptide having enzymatic activity can be
identified and obtained using any method such as those described herein. For example,
nucleic acid molecules that encode a polypeptide having enzymatic activity can be
identified and obtained using common molecular cloning or chemical nucleic acid
synthesis procedures and techniques, including PCR. In addition, standard nucleic acid
sequencing techniques and software programs that translate nucleic acid sequences into
amino acid sequences based on the genetic code can be used to determine whether or not
a particular nucleic acid has any sequence homology with known enzymatic polypeptides.
Sequence alignment software such as MEGALIGN® (DNASTAR, Madison, WI, 1997)
can be used to compare various sequences. In addition, nucleic acid molecules encoding
known enzymatic polypeptides can be mutated using common molecular cloning
techniques (e.g., site-directed mutagenesis). Possible mutations include, without
limitation, deletions, insertions, and base substitutions, as well as combinations of
deletions, insertions, and base substitutions. Further, nucleic acid and amino acid
databases (e.g., GenBank®) can be used to identify a nucleic acid sequence that encodes a
polypeptide having enzymatic activity. Briefly, any amino acid sequence having some
homology to a polypeptide having enzymatic activity, or any nucleic acid sequence
having some homology to a sequence encoding a polypeptide having enzymatic activity
can be used as a query to search GenBank®. The identified polypeptides then can be
analyzed to determine whether or not they exhibit enzymatic activity.
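
Translating a nucleic acid sequence into an amino acid sequence based on the genetic code, as mentioned above, can be sketched in a few lines. This Python example assumes the standard genetic code (NCBI translation table 1), reads a single forward reading frame starting at the first base, and stops at the first stop codon; the function name `translate` and the short input sequence are hypothetical.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1); "*" marks stop codons.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nucleic_acid):
    """Translate one reading frame, starting at the first base and ending
    at the first stop codon or at the last complete codon. Input may be
    DNA or RNA; characters other than A, C, G, T, or U raise KeyError."""
    sequence = nucleic_acid.upper().replace("U", "T")
    residues = []
    for index in range(0, len(sequence) - 2, 3):
        amino_acid = CODON_TABLE[sequence[index:index + 3]]
        if amino_acid == "*":
            break
        residues.append(amino_acid)
    return "".join(residues)


# Hypothetical 12-nucleotide open reading frame.
protein = translate("atggctaaatga")
print(protein)
```

The translated sequence can then be used as a database query or as input to alignment software, as the paragraph above describes.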

In addition, nucleic acid hybridization techniques can be used to identify and
obtain a nucleic acid molecule that encodes a polypeptide having enzymatic activity.
Briefly, any nucleic acid molecule that encodes a known enzymatic polypeptide, or
fragment thereof, can be used as a probe to identify similar nucleic acid molecules by
hybridization under conditions of moderate to high stringency. Such similar nucleic acid
molecules then can be isolated, sequenced, and analyzed to determine whether the
encoded polypeptide has enzymatic activity.

Expression cloning techniques also can be used to identify and obtain a nucleic
acid molecule that encodes a polypeptide having enzymatic activity. For example, a
substrate known to interact with a particular enzymatic polypeptide can be used to screen
a phage display library containing that enzymatic polypeptide. Phage display libraries
can be generated as described elsewhere (Burritt et al., Anal. Biochem. 238:1-13 (1990)),
or can be obtained from commercial suppliers such as Novagen (Madison, WI).

Further, polypeptide sequencing techniques can be used to identify and obtain a
nucleic acid molecule that encodes a polypeptide having enzymatic activity. For
example, a purified polypeptide can be separated by electrophoresis, and its amino
acid sequence determined by, for example, amino acid microsequencing techniques.
Once determined, the amino acid sequence can be used to design degenerate
oligonucleotide primers. Degenerate oligonucleotide primers can be used to obtain the
nucleic acid encoding the polypeptide by PCR. Once obtained, the nucleic acid can be
sequenced, cloned into an appropriate expression vector, and introduced into a
microorganism.

Any method can be used to introduce an exogenous nucleic acid molecule into a
cell. In fact, many methods for introducing nucleic acid into microorganisms such as
bacteria and yeast are well known to those skilled in the art. For example, heat shock,
lipofection, electroporation, conjugation, fusion of protoplasts, and biolistic delivery are
common methods for introducing nucleic acid into bacteria and yeast cells. See, e.g., Ito
et al., J. Bacteriol. 153:163-168 (1983); Durrens et al., Curr. Genet. 18:7-12 (1990); and
Becker and Guarente, Methods in Enzymology 194:182-187 (1991).

An exogenous nucleic acid molecule contained within a particular cell of the
invention can be maintained within that cell in any form. For example, exogenous
nucleic acid molecules can be integrated into the genome of the cell or maintained in an
episomal state. In other words, a cell of the invention can be a stable or transient
transformant. Again, a microorganism described herein can contain a single copy, or
multiple copies (e.g., about 5, 10, 20, 35, 50, 75, 100 or 150 copies), of a particular
exogenous nucleic acid molecule as described herein.

Methods for expressing an amino acid sequence from an exogenous nucleic acid
molecule are well known to those skilled in the art. Such methods include, without
limitation, constructing a nucleic acid such that a regulatory element promotes the
expression of a nucleic acid sequence that encodes a polypeptide. Typically, regulatory
elements are DNA sequences that regulate the expression of other DNA sequences at the
level of transcription. Thus, regulatory elements include, without limitation, promoters,
enhancers, and the like. Any type of promoter can be used to express an amino acid
sequence from an exogenous nucleic acid molecule. Examples of promoters include,
without limitation, constitutive promoters, tissue-specific promoters, and promoters
responsive or unresponsive to a particular stimulus (e.g., light, oxygen, chemical
concentration, and the like). Moreover, methods for expressing a polypeptide from an
exogenous nucleic acid molecule in cells such as bacterial cells and yeast cells are well
known to those skilled in the art. For example, nucleic acid constructs that are capable of
expressing exogenous polypeptides within E. coli are well known. See, e.g., Sambrook et
al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, New
York, USA, second edition (1989).

B. Production of Organic Acids and Related Products via Host Cells 

The nucleic acid and amino acid sequences provided herein can be used with cells
to produce 3-HP and/or other organic compounds such as 1,3-propanediol, acrylic acid,
polymerized acrylate, esters of acrylate, esters of 3-HP, and polymerized 3-HP. Such
cells can be from any species including those listed within the taxonomy web pages at the
National Institute of Health sponsored by the United States government
(www.ncbi.nlm.nih.gov). The cells can be eukaryotic or prokaryotic. For example,
genetically modified cells of the invention can be mammalian cells (e.g., human, murine,
and bovine cells), plant cells (e.g., corn, wheat, rice, and soybean cells), fungal cells (e.g.,
Aspergillus and Rhizopus cells), yeast cells, or bacterial cells (e.g., Lactobacillus,
Lactococcus, Bacillus, Escherichia, and Clostridium cells). A cell of the invention also
can be a microorganism. The term "microorganism" as used herein refers to any
microscopic organism including, without limitation, bacteria, algae, fungi, and protozoa.
Thus, E. coli, S. cerevisiae, Kluyveromyces lactis, Candida blankii, Candida rugosa, and
Pichia pastoris are considered microorganisms and can be used as described herein.

Typically, a cell of the invention is genetically modified such that a particular
organic compound is produced. In one embodiment, the invention provides cells that
make 3-HP from PEP. Examples of biosynthetic pathways that can be used by cells to make
3-HP are shown in Figures 1-5, 43-44, 54, and 55.

Generally, cells that are genetically modified to synthesize a particular organic
compound contain one or more exogenous nucleic acid molecules that encode
polypeptides having specific enzymatic activities. For example, a microorganism can
contain exogenous nucleic acid that encodes a polypeptide having 3-hydroxypropionyl-CoA
dehydratase activity. In this case, acrylyl-CoA can be converted into
3-hydroxypropionic acid-CoA, which can lead to the production of 3-HP. It is noted that a
cell can be given an exogenous nucleic acid molecule that encodes a polypeptide having
an enzymatic activity that catalyzes the production of a compound not normally produced
by that cell. Alternatively, a cell can be given an exogenous nucleic acid molecule that
encodes a polypeptide having an enzymatic activity that catalyzes the production of a
compound that is normally produced by that cell. In this case, the genetically modified
cell can produce more of the compound, or can produce the compound more efficiently,
than a similar cell not having the genetic modification.

In one embodiment, the invention provides a cell containing an exogenous nucleic
acid molecule that encodes a polypeptide having enzymatic activity that leads to the
formation of 3-HP. It is noted that the produced 3-HP can be secreted from the cell,
eliminating the need to disrupt cell membranes to retrieve the organic compound.
Typically, the cell of the invention produces 3-HP with the concentration being at least
about 100 mg per L (e.g., at least about 1 g/L, 5 g/L, 10 g/L, 25 g/L, 50 g/L, 75 g/L, 80
g/L, 90 g/L, 100 g/L, or 120 g/L). When determining the yield of an organic compound
such as 3-HP for a particular cell, any method can be used. See, e.g., Applied and
Environmental Microbiology 59(12):4261-4265 (1993). Typically, a cell within the scope
of the invention such as a microorganism catabolizes a hexose carbon source such as
glucose. A cell, however, can catabolize a variety of carbon sources such as pentose
sugars (e.g., ribose, arabinose, xylose, and lyxose), fatty acids, acetate, or glycerols. In
other words, a cell within the scope of the invention can utilize a variety of carbon
sources.

As described herein, a cell within the scope of the invention can contain an
exogenous nucleic acid molecule that encodes a polypeptide having enzymatic activity
that leads to the formation of 3-HP or other organic compounds such as 1,3-propanediol,
acrylic acid, poly-acrylate, acrylate-esters, 3-HP-esters, and poly-3-HP. Methods of
identifying cells that contain exogenous nucleic acid are well known to those skilled in
the art. Such methods include, without limitation, PCR and nucleic acid hybridization
techniques such as Northern and Southern analysis (see hybridization described herein).
In some cases, immunohistochemistry and biochemical techniques can be used to
determine if a cell contains a particular nucleic acid by detecting the expression of the
polypeptide encoded by that particular nucleic acid molecule. For example, an antibody
having specificity for a polypeptide can be used to determine whether or not a particular
cell contains nucleic acid encoding that polypeptide. Further, biochemical techniques can
be used to determine if a cell contains a particular nucleic acid molecule encoding a
polypeptide having enzymatic activity by detecting an organic product produced as a
result of the expression of the polypeptide having enzymatic activity. For example,
detection of 3-HP after introduction of exogenous nucleic acid that encodes a polypeptide
having 3-hydroxypropionyl-CoA dehydratase activity into a cell that does not normally
express such a polypeptide can indicate that that cell not only contains the introduced
exogenous nucleic acid molecule but also expresses the encoded polypeptide from that
introduced exogenous nucleic acid molecule. Methods for detecting specific enzymatic
activities or the presence of particular organic products are well known to those skilled in
the art. For example, the presence of an organic compound such as 3-HP can be
determined as described elsewhere. See, Sullivan and Clarke, J. Assoc. Offic. Agr.
Chemists, 38:514-518 (1955).

C. Cells with Reduced Polypeptide Activity

The invention also provides genetically modified cells having reduced polypeptide
activity. The term "reduced" as used herein with respect to a cell and a particular
polypeptide's activity refers to a lower level of activity than that measured in a
comparable cell of the same species. For example, a particular microorganism lacking
enzymatic activity X is considered to have reduced enzymatic activity X if a comparable
microorganism has at least some enzymatic activity X. It is noted that a cell can have the
activity of any type of polypeptide reduced including, without limitation, enzymes,
transcription factors, transporters, receptors, signal molecules, and the like. For example,
a cell can contain an exogenous nucleic acid molecule that disrupts a regulatory and/or
coding sequence of a polypeptide having pyruvate decarboxylase activity or alcohol
dehydrogenase activity. Disrupting pyruvate decarboxylase and/or alcohol
dehydrogenase expression can lead to the accumulation of lactate as well as products
produced from lactate such as 3-HP, 1,3-propanediol, acrylic acid, poly-acrylate,
acrylate-esters, 3-HP-esters, and poly-3-HP. It is also noted that reduced polypeptide activities
can be the result of lower polypeptide concentration, lower specific activity of a
polypeptide, or combinations thereof. Many different methods can be used to make a cell
having reduced polypeptide activity. For example, a cell can be engineered to have a
disrupted regulatory sequence or polypeptide-encoding sequence using common
mutagenesis or knock-out technology. See, e.g., Methods in Yeast Genetics (1997
edition), Adams, Gottschling, Kaiser, and Stearns, Cold Spring Harbor Press (1998).
Alternatively, antisense technology can be used to reduce the activity of a particular
polypeptide. For example, a cell can be engineered to contain a cDNA that encodes an
antisense molecule that prevents a polypeptide from being translated. The term
"antisense molecule" as used herein encompasses any nucleic acid molecule or nucleic
acid analog (e.g., peptide nucleic acids) that contains a sequence that corresponds to the
coding strand of an endogenous polypeptide. An antisense molecule also can have
flanking sequences (e.g., regulatory sequences). Thus, antisense molecules can be
ribozymes or antisense oligonucleotides. A ribozyme can have any general structure
including, without limitation, hairpin, hammerhead, or axhead structures, provided the
molecule cleaves RNA. Further, gene silencing can be used to reduce the activity of a
particular polypeptide.

A cell having reduced activity of a polypeptide can be identified using any 
method. For example, enzyme activity assays such as those described herein can be used 
to identify cells having a reduced enzyme activity. 

A polypeptide having (1) the amino acid sequence set forth in SEQ ID NO:39 (the
OS17 polypeptide) or (2) an amino acid sequence sharing at least about 60 percent
sequence identity with the amino acid sequence set forth in SEQ ID NO:39 can have three
functional domains: a domain having CoA-synthetase activity, a domain having
3-HP-CoA dehydratase activity, and a domain having CoA-reductase activity. Such
polypeptides can be selectively modified by mutating and/or deleting domains such that
one or two of the enzymatic activities are reduced. Reducing the dehydratase activity of
the OS17 polypeptide can cause acrylyl-CoA to be created from propionyl-CoA. The
acrylyl-CoA then can be contacted with a polypeptide having CoA hydrolase activity to
produce acrylate from propionate (Figure 43). Similarly, acrylyl-CoA can be created
from 3-HP by using, for example, an OS17 polypeptide having reduced reductase
activity.

D. Production of Organic Acids and Related Products via In Vitro 
Techniques 

In addition, purified polypeptides having enzymatic activity can be used alone or
in combination with cells to produce 3-HP or other organic compounds such as
1,3-propanediol, acrylic acid, polymerized acrylate, esters of acrylate, esters of 3-HP, and
polymerized 3-HP. For example, a preparation containing a substantially pure
polypeptide having 3-hydroxypropionyl-CoA dehydratase activity can be used to catalyze
the formation of 3-HP-CoA, a precursor to 3-HP. Further, cell-free extracts containing a
polypeptide having enzymatic activity can be used alone or in combination with purified
polypeptides and/or cells to produce 3-HP. For example, a cell-free extract containing a
polypeptide having CoA transferase activity can be used to form lactyl-CoA, while a
microorganism containing polypeptides having the enzymatic activities necessary to
catalyze the reactions needed to form 3-HP from lactyl-CoA can be used to produce
3-HP. Any method can be used to produce a cell-free extract. For example, osmotic shock,
sonication, and/or a repeated freeze-thaw cycle followed by filtration and/or
centrifugation can be used to produce a cell-free extract from intact cells.

It is noted that a cell, purified polypeptide, and/or cell-free extract can be used to
produce 3-HP that is, in turn, treated chemically to produce another compound. For
example, a microorganism can be used to produce 3-HP, while a chemical process is used
to modify 3-HP into a derivative such as polymerized 3-HP or an ester of 3-HP.
Likewise, a chemical process can be used to produce a particular compound that is, in
turn, converted into 3-HP or other organic compound (e.g., 1,3-propanediol, acrylic acid,
polymerized acrylate, esters of acrylate, esters of 3-HP, and polymerized 3-HP) using a
cell, substantially pure polypeptide, and/or cell-free extract described herein. For
example, a chemical process can be used to produce acrylyl-CoA, while a microorganism
can be used to convert acrylyl-CoA into 3-HP.

E. Fermentation of Cells to Produce Organic Acids

Typically, 3-HP is produced by providing a production cell, such as a
microorganism, and culturing the microorganism with culture medium such that 3-HP is
produced. In general, the culture media and/or culture conditions can be such that the
microorganisms grow to an adequate density and produce 3-HP efficiently. For
large-scale production processes, any method can be used such as those described elsewhere
(Manual of Industrial Microbiology and Biotechnology, 2nd Edition, Editors: A. L.
Demain and J. E. Davies, ASM Press; and Principles of Fermentation Technology, P. F.
Stanbury and A. Whitaker, Pergamon). Briefly, a large tank (e.g., a 100 gallon, 200
gallon, 500 gallon, or more tank) containing appropriate culture medium with, for
example, a glucose carbon source is inoculated with a particular microorganism. After
inoculation, the microorganisms are incubated to allow biomass to be produced. Once a
desired biomass is reached, the broth containing the microorganisms can be transferred to
a second tank. This second tank can be any size. For example, the second tank can be
larger, smaller, or the same size as the first tank. Typically, the second tank is larger than
the first such that additional culture medium can be added to the broth from the first tank.
In addition, the culture medium within this second tank can be the same as, or different
from, that used in the first tank. For example, the first tank can contain medium with
xylose, while the second tank contains medium with glucose.

Once transferred, the microorganisms can be incubated to allow for the production
of 3-HP. Once produced, any method can be used to isolate the 3-HP. For example,
common separation techniques can be used to remove the biomass from the broth, and
common isolation procedures (e.g., extraction, distillation, and ion-exchange procedures)
can be used to obtain the 3-HP from the microorganism-free broth. In addition, 3-HP can
be isolated while it is being produced, or it can be isolated from the broth after the
product production phase has been terminated.

F. Products Created From the Disclosed Biosynthetic Routes

The organic compounds produced from any of the steps provided in Figures 1-5,
43-44, 54, and 55 can be chemically converted into other organic compounds. For
example, 3-HP can be hydrogenated to form 1,3-propanediol, a valuable polyester
monomer. Hydrogenating an organic acid such as 3-HP can be performed using any
method such as those used to hydrogenate succinic acid and/or lactic acid. For example,
3-HP can be hydrogenated using a metal catalyst. In another example, 3-HP can be
dehydrated to form acrylic acid. Any method can be used to perform a dehydration
reaction. For example, 3-HP can be heated in the presence of a catalyst (e.g., a metal or
mineral acid catalyst) to form acrylic acid. Propanediol also can be created using
polypeptides having oxidoreductase activity (e.g., enzymes in the 1.1.1.- class of
enzymes) in vitro or in vivo.

V. Overview of Methodology Used to Create Biosynthetic Pathways 
That Make 3-HP from PEP 

The invention provides methods of making 3-HP and related products from PEP
via the use of biosynthetic pathways. Illustrative examples include methods involving the
production of 3-HP via a lactate intermediate, a malonyl-CoA intermediate, and a
β-alanine intermediate.

A. Biosynthetic Pathway for Making 3-HP through a Lactic Acid
Intermediate

A biosynthetic pathway that allows for the production of 3-HP from PEP was
constructed (Figure 1). This pathway used several polypeptides that were
cloned and expressed as described herein. M. elsdenii cells (ATCC 17753) were used as
a source of genomic DNA. Primers were used to identify and clone a nucleic acid
sequence encoding a polypeptide having CoA transferase activity (SEQ ID NO: 1). The
polypeptide was subsequently tested for enzymatic activity and found to have CoA
transferase activity.

Similarly, PCR primers were used to identify nucleic acid sequences from M.
elsdenii genomic DNA that encoded E1 activator, E2 α, and E2 β polypeptides (SEQ
ID NOs: 9, 17, and 25, respectively). These polypeptides were subsequently shown to
have lactyl-CoA dehydratase activity.

Chloroflexus aurantiacus cells (ATCC 29365) were used as a source of genomic
DNA. Initial cloning led to the identification of two nucleic acid sequences: OS17 (SEQ ID
NO: 129) and OS19 (SEQ ID NO: 40). Subsequent assays revealed that OS17 encodes
a polypeptide having CoA synthase activity, dehydratase activity, and dehydrogenase
activity (propionyl-CoA synthetase). Subsequent assays also revealed that OS19
encodes a polypeptide having 3-hydroxypropionyl-CoA dehydratase activity (also
referred to as acrylyl-CoA hydratase activity).

Several operons were constructed for use in E. coli. These operons allow for the
production of 3-HP in bacterial cells. Additional experiments allowed for the expression
of these polypeptides in yeast, which can be used to produce 3-HP.

B. Biosynthetic Pathway for Making 3-HP through a Malonyl-CoA 
Intermediate 

Another pathway leading to the production of 3-HP from PEP was constructed.
This pathway used a polypeptide having acetyl-CoA carboxylase activity that was isolated
from E. coli (Example 9), and a polypeptide having malonyl-CoA reductase activity that
was isolated from Chloroflexus aurantiacus (Example 10). The combination of these two
polypeptides allows for the production of 3-HP from acetyl-CoA (Figure 44).

Nucleic acid encoding a polypeptide having malonyl-CoA reductase activity (SEQ
ID NO: 140) was cloned, sequenced, and expressed. The polypeptide having
malonyl-CoA reductase activity was then used to make 3-HP.

C. Biosynthetic Pathways for Making 3-HP through a β-alanine
Intermediate

In general, prokaryotes and eukaryotes metabolize glucose via the
Embden-Meyerhof-Parnas pathway to PEP, a central metabolite in carbon metabolism. The PEP
generated from glucose is either carboxylated to oxaloacetate or is converted to pyruvate.
Carboxylation of PEP to oxaloacetate can be catalyzed by a polypeptide having PEP
carboxylase activity, a polypeptide having PEP carboxykinase activity, or a polypeptide
having PEP transcarboxylase activity. Pyruvate that is generated from PEP by a
polypeptide having pyruvate kinase activity can also be converted to oxaloacetate by a
polypeptide having pyruvate carboxylase activity.

Oxaloacetate generated either from PEP or pyruvate can act as a precursor for
production of aspartic acid. This conversion can be carried out by a polypeptide having
aspartate aminotransferase activity, which transfers an amino group from glutamate to
oxaloacetate. Glutamate consumed in this reaction can be regenerated by the action of a
polypeptide having glutamate dehydrogenase activity or by the action of a polypeptide
having 4,4-aminobutyrate aminotransferase activity. The decarboxylation of aspartate to
β-alanine is catalyzed by a polypeptide having aspartate decarboxylase activity. β-alanine
produced through this biochemistry can be converted to 3-HP via two possible pathways.
These pathways are provided in Figures 54 and 55.

The steps involved in the production of β-alanine can be the same for both
pathways. These steps can be accomplished by endogenous polypeptides in the host cells
which convert PEP to β-alanine, or these steps can be accomplished with recombinant
DNA technology using known polypeptides such as polypeptides having
PEP-carboxykinase activity (4.1.1.32), aspartate aminotransferase activity (2.6.1.1), and
aspartate alpha-decarboxylase activity (4.1.1.11).

As depicted in Figure 54, a polypeptide having CoA transferase activity (e.g., a
polypeptide having a sequence set forth in SEQ ID NO:2) can be used to convert
β-alanine to β-alanyl-CoA. β-alanyl-CoA can be converted to acrylyl-CoA via a
polypeptide having β-alanyl-CoA ammonia lyase activity (e.g., a polypeptide having a
sequence set forth in SEQ ID NO:160). Acrylyl-CoA can be converted to 3-HP-CoA
using a polypeptide having 3-HP-CoA dehydratase activity (e.g., a polypeptide having a
sequence set forth in SEQ ID NO:40). 3-HP-CoA can be converted into 3-HP via a
polypeptide having CoA transferase activity (e.g., a polypeptide having a sequence set
forth in SEQ ID NO:2).

As depicted in Figure 55, a polypeptide having 4,4-aminobutyrate
aminotransferase activity (2.6.1.19) can be used to convert β-alanine into malonate
semialdehyde. The malonate semialdehyde can be converted into 3-HP using either a
polypeptide having 3-hydroxypropionate dehydrogenase activity (1.1.1.59) or a
polypeptide having 3-hydroxyisobutyrate dehydrogenase activity.
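
The two routes from β-alanine to 3-HP described above can be sketched as ordered lists of conversion steps. The following Python fragment is only an illustrative data model; the names `Step`, `is_connected`, and `summarize` are hypothetical and are not part of the disclosure. It records the substrate, product, and enzymatic activity for each step and checks that every step consumes what the preceding step produces.

```python
from typing import NamedTuple


class Step(NamedTuple):
    substrate: str
    product: str
    activity: str  # enzymatic activity named in the text for this conversion


# Route depicted in Figure 54 (via a beta-alanyl-CoA intermediate).
FIGURE_54 = [
    Step("beta-alanine", "beta-alanyl-CoA", "CoA transferase"),
    Step("beta-alanyl-CoA", "acrylyl-CoA", "beta-alanyl-CoA ammonia lyase"),
    Step("acrylyl-CoA", "3-HP-CoA", "3-HP-CoA dehydratase"),
    Step("3-HP-CoA", "3-HP", "CoA transferase"),
]

# Route depicted in Figure 55 (via malonate semialdehyde); the text names
# 3-hydroxyisobutyrate dehydrogenase as an alternative for the second step.
FIGURE_55 = [
    Step("beta-alanine", "malonate semialdehyde", "4,4-aminobutyrate aminotransferase"),
    Step("malonate semialdehyde", "3-HP", "3-hydroxypropionate dehydrogenase"),
]


def is_connected(route: list[Step]) -> bool:
    """Return True if each step's product is the next step's substrate."""
    return all(a.product == b.substrate for a, b in zip(route, route[1:]))


def summarize(route: list[Step]) -> str:
    """Render a route as 'A -> B -> C'."""
    return " -> ".join([route[0].substrate] + [step.product for step in route])


if __name__ == "__main__":
    for name, route in (("Figure 54", FIGURE_54), ("Figure 55", FIGURE_55)):
        print(name, "connected:", is_connected(route), "|", summarize(route))
```

Laid out this way, both routes start from the same β-alanine pool and end at 3-HP; the Figure 54 route passes through CoA-thioester intermediates, while the Figure 55 route does not.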

EXAMPLES 
Example 1 - Cloning nucleic acid molecules that
encode a polypeptide having CoA transferase activity

Genomic DNA was isolated from Megasphaera elsdenii cells (ATCC 17753)
grown in 1053 Reinforced Clostridium media under anaerobic conditions at 37°C in roll
tubes for 12-14 hours. Once grown, the cells were pelleted, washed with 5 mL of a 10
mM Tris solution, and repelleted. The pellet was resuspended in 1 mL of Gentra Cell
Suspension Solution to which 14.2 mg of lysozyme and 4 µL of 20 mg/mL proteinase K
solution were added. The cell suspension was incubated at 37°C for 30 minutes. The
genomic DNA was then isolated using a Gentra Genomic DNA Isolation Kit following
the provided protocol. The precipitated genomic DNA was spooled and air-dried for 10
minutes. The genomic DNA was suspended in 500 µL of a 10 mM Tris solution and
stored at 4°C.

Two degenerate forward (CoAF1 and CoAF2) and three degenerate reverse
(CoAR1, CoAR2, and CoAR3) PCR primers were designed based on conserved
acetoacetyl CoA transferase and propionate CoA transferase sequences
(CoAF1 5'-GAAWSCGGYSCNATYGGYGG-3', SEQ ID NO: 49;
CoAF2 5'-TTYTGYGGYRSBTTYACBGCWGG-3', SEQ ID NO: 50;
CoAR1 5'-CCWGCVGTRAAVSYRCCRCARAA-3', SEQ ID NO: 51;
CoAR2 5'-AARACDSMRCGTTCVGTRATRTA-3', SEQ ID NO: 52; and
CoAR3 5'-TCRAYRCCSGGWGCRAYTTC-3', SEQ ID NO: 53).
The primers were used in all logical combinations in PCR using Taq
polymerase (Roche Molecular Biochemicals, Indianapolis, IN) and 1 ng of genomic DNA
per µL reaction mix. PCR was conducted using a touchdown PCR program with 4 cycles
at an annealing temperature of 59°C, 4 cycles at 57°C, 4 cycles at 55°C, and 18 cycles at
52°C. Each cycle used an initial 30-second denaturing step at 94°C and a 3 minute
extension at 72°C. The program had an initial denaturing step for 2 minutes at 94°C and
a final extension step of 4 minutes at 72°C. Time allowed for annealing was 45 seconds.
The amounts of PCR primer used in the reactions were increased 2-8 fold above typical
PCR amounts depending on the amount of degeneracy in the 3' end of the primer. In
addition, separate PCR reactions containing each individual primer were made to identify
PCR products resulting from single degenerate primers. Each PCR product (25 µL) was
separated by electrophoresis using a 1% TAE (Tris-acetate-EDTA) agarose gel.
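
The touchdown program described above can be written out as a simple schedule. The following Python sketch is an illustration under stated assumptions, not the thermocycler program actually used: the names `Stage`, `schedule`, `total_cycles`, and `programmed_hold_seconds` are hypothetical, and the time estimate counts only the programmed hold times (ramping between temperatures is ignored).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    cycles: int    # number of cycles run at this annealing temperature
    anneal_c: int  # annealing temperature in degrees C


# Stage values taken from the touchdown program described in the text.
STAGES = [Stage(4, 59), Stage(4, 57), Stage(4, 55), Stage(18, 52)]

# Per-cycle hold times in seconds: 94 C denature, anneal, 72 C extension.
DENATURE_S, ANNEAL_S, EXTEND_S = 30, 45, 180
# One-time holds in seconds: initial 94 C denature and final 72 C extension.
INITIAL_DENATURE_S, FINAL_EXTEND_S = 120, 240


def total_cycles(stages):
    """Total number of amplification cycles across all stages."""
    return sum(stage.cycles for stage in stages)


def programmed_hold_seconds(stages):
    """Sum of programmed hold times; temperature ramping is not included."""
    per_cycle = DENATURE_S + ANNEAL_S + EXTEND_S
    return INITIAL_DENATURE_S + total_cycles(stages) * per_cycle + FINAL_EXTEND_S


def schedule(stages):
    """Yield (cycle_number, annealing_temperature_c) for every cycle in order."""
    cycle = 0
    for stage in stages:
        for _ in range(stage.cycles):
            cycle += 1
            yield cycle, stage.anneal_c


if __name__ == "__main__":
    for cycle, temp in schedule(STAGES):
        print(f"cycle {cycle:2d}: anneal at {temp} C")
    print("total cycles:", total_cycles(STAGES))
    print("programmed hold time (min):", programmed_hold_seconds(STAGES) / 60)
```

Stepping the annealing temperature down in stages favors specific priming in the early cycles, which is a common reason for choosing a touchdown program with degenerate primers.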

The CoAF1-CoAR2, CoAF1-CoAR3, CoAF2-CoAR2, and CoAF2-CoAR3
combinations produced a band of 423, 474, 177, and 228 bp, respectively. These bands
matched the sizes based on other CoA transferase sequences. No band was visible from
the individual primer control reactions. The CoAF1-CoAR3 fragment (474 bp) was
isolated and purified using a Qiagen Gel Extraction Kit (Qiagen Inc., Valencia, CA).
Four µL of the purified band was ligated into pCRII vector and transformed into TOP10
E. coli cells by heat-shock using a TOPO cloning procedure (Invitrogen, Carlsbad, CA).
Transformations were plated on LB media containing 100 µg/mL of ampicillin (Amp)
and 50 µg/mL of 5-Bromo-4-Chloro-3-Indolyl-β-D-Galactopyranoside (X-gal). Single,
white colonies were plated onto fresh media and screened in a PCR reaction using the
CoAF1 and CoAR3 primers to confirm the presence of the insert.
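
Because the primer amounts in this example were scaled according to primer degeneracy, it can be useful to count how many distinct sequences a degenerate primer represents. The following Python sketch does this using the standard IUPAC nucleotide codes. The function names are hypothetical, and the two primer sequences are given as read from the listing above with scanning damage repaired, so they should be treated as assumptions rather than as the authoritative sequence listing.

```python
from math import prod

# IUPAC nucleotide codes and the bases each code represents.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
# Complement of each IUPAC code (for example, R = A/G pairs with Y = C/T).
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")

# Assumed readings of two primers from this example (see note above).
COAF2 = "TTYTGYGGYRSBTTYACBGCWGG"
COAR1 = "CCWGCVGTRAAVSYRCCRCARAA"


def degeneracy(primer: str) -> int:
    """Number of distinct non-degenerate sequences encoded by the primer."""
    return prod(len(IUPAC[base]) for base in primer.upper())


def reverse_complement(primer: str) -> str:
    """Reverse complement of a primer, preserving IUPAC ambiguity codes."""
    return primer.upper().translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    print("CoAF2 degeneracy:", degeneracy(COAF2))
    print("CoAF2 degeneracy, last 6 bases:", degeneracy(COAF2[-6:]))
    print("CoAR1 is reverse complement of CoAF2:",
          reverse_complement(COAF2) == COAR1)
```

With the sequences as read here, CoAR1 is the reverse complement of CoAF2, which would mean the two primers target the same conserved site in opposite orientations. The degeneracy of the last few 3' bases is reported separately because the example ties the primer amount to degeneracy at the 3' end.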

Plasmid DNA obtained using a QiaPrep Spin Miniprep Kit (Qiagen, Inc.) was
quantified and used for DNA sequencing with M13R and M13F primers. Sequence
analysis revealed that the CoAF1-CoAR3 fragment shared sequence similarity with
acetoacetyl CoA transferase sequences.

Genome walking was performed to obtain the complete coding sequence. The
following primers for genome walking in both upstream and downstream directions were
designed using the portion of the 474 bp CoAF1-CoAR3 fragment sequence that was
internal to the degenerate primers
(COAGSP1F 5'-GAATGTTTACTTCTGGGGCACCTTCAC-3', SEQ ID NO:54;
COAGSP2F 5'-GACCAGATCACTTTCAACGGTTCCTATG-3', SEQ ID NO:55;
COAGSP1R 5'-GCATAGGAACCGTTGAAAGTGATCTGG-3', SEQ ID NO:56; and
COAGSP2R 5'-GTTAGTACCGAACTTGCTGACGTTGATG-3', SEQ ID NO:57).
The COAGSP1F and COAGSP2F primers face
downstream, while the COAGSP1R and COAGSP2R primers face upstream. In addition,
the COAGSP2F and COAGSP2R primers are nested inside the COAGSP1F and
COAGSP1R primers. Genome walking was performed using the Universal Genome
Walking kit (ClonTech Laboratories, Inc., Palo Alto, CA) with the exception that
additional libraries were generated with enzymes Nru I, Sca I, and Hinc II. First round
PCR was conducted in a Perkin Elmer 2400 Thermocycler with 7 cycles of 2 seconds at
94°C and 3 minutes at 72°C, and 36 cycles of 2 seconds at 94°C and 3 minutes at 65°C
with a final extension at 65°C for 4 minutes. Second round PCR used 5 cycles of 2
seconds at 94°C and 3 minutes at 72°C, and 20 cycles of 2 seconds at 94°C and 3 minutes
at 65°C with a final extension at 65°C for 4 minutes. The first and second round product
(20 µL) was separated by electrophoresis on a 1% TAE agarose gel. Amplification
products were obtained with the Stu I library for the reverse direction. The second round
product of 1.5 Kb from this library was gel purified, cloned, and sequenced. Sequence
analysis revealed that the sequence derived from genome walking overlapped with the
CoAF1-CoAR3 fragment and shared sequence similarity with other sequences such as
acetoacetyl CoA transferase sequences (Figures 8-9).

Nucleic acid encoding the CoA transferase (propionyl-CoA transferase or pct)
from Megasphaera elsdenii was PCR amplified from chromosomal DNA using the following
PCR program: 25 cycles of 95°C for 30 seconds to denature, 50°C for 30 seconds to
anneal, and 72°C for 3 minutes for extension (plus 2 seconds per cycle). The primers
used were designated PCT-1.114 (5'-ATGAGAAAAGTAGAAATCATTAC-3'; SEQ ID
NO:58) and PCT-2.2045 (5'-GGGGGAAGTTGACGATAATG-3'; SEQ ID NO:59). The
resulting PCR product (about 2 kb as judged by agarose gel electrophoresis) was purified
using a Qiagen PCR purification kit (Qiagen Inc., Valencia, CA). The purified product
was ligated to pETBlue-1 using the Perfectly Blunt cloning Kit (Novagen, Madison, WI).
The ligation reaction was transformed into NovaBlue chemically competent cells
(Novagen, Madison, WI) that were spread on LB agar plates supplemented with 50
µg/mL carbenicillin, 40 µg/mL IPTG, and 40 µg/mL X-Gal. White colonies were isolated
and screened for the presence of inserts by restriction mapping. Isolates with the correct
restriction pattern were sequenced from each end using the primers pETBlueUP and
pETBlueDOWN (Novagen) to confirm the sequence at the ligation points.

The plasmid was transformed into Tuner (DE3) pLacI chemically competent cells
(Novagen, Madison, WI), and expression from the construct was tested. Briefly, a culture was
grown overnight to saturation and diluted 1:20 the following morning in fresh LB
medium with the appropriate antibiotics. The culture was grown at 37°C with aeration to
an OD600 of about 0.6. The culture was induced with IPTG at a final concentration of 100
µM. The culture was incubated for an additional two hours at 37°C with aeration.
Aliquots were taken pre-induction and 2 hours post-induction for SDS-PAGE analysis. A
band of the expected molecular weight (55,653 Daltons predicted from the sequence) was
observed after IPTG treatment. This band was not observed in cells containing a plasmid
lacking the nucleic acid encoding the transferase.

Cell free extracts were prepared to assess enzymatic activity. Briefly, the cells
were harvested by centrifugation and disrupted by sonication. The sonicated cell
suspension was centrifuged to remove cell debris, and the supernatant was used in the
assays.

Transferase activity was measured in the following assay. The assay mixture used
contained 100 mM potassium phosphate buffer (pH 7.0), 200 mM sodium acetate, 1 mM
dithiobisnitrobenzoate (DTNB), 500 nM oxaloacetate, 25 mM CoA-ester substrate, and 3
µg/mL citrate synthase. If present, the CoA transferase transfers the CoA from the CoA
ester to acetate to form acetyl-CoA. The added citrate synthase condenses oxaloacetate
and acetyl-CoA to form citrate and free CoASH. The free CoASH complexes with
DTNB, and the formation of this complex can be measured by a change in the optical
density at 412 nm. The activity of the CoA transferase was measured using the following
substrates: lactyl-CoA, propionyl-CoA, acrylyl-CoA, and 3-hydroxypropionyl-CoA. The
units/mg of protein was calculated using the following formula:

(ΔE/min * Vf * dilution factor) / (Vs * 14.2) = units/mL

where ΔE/min is the change in absorbance per minute at 412 nm, Vf is the final volume of
the reaction, and Vs is the volume of sample added. The total protein concentration of the
cell free extract was about 1 mg/mL so the units/mL equals units/mg.
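
The calculation above can be sketched in a few lines of Python. This is only an illustration of the stated formula: the function and parameter names are hypothetical, the constant 14.2 is taken as given in the text, and the absorbance reading in the usage line is an invented example value, not a measured result.

```python
def transferase_units_per_ml(delta_e_per_min, final_volume, sample_volume,
                             dilution_factor=1.0, extinction=14.2):
    """Apply (dE/min * Vf * dilution) / (Vs * 14.2) from the text.

    final_volume and sample_volume must be given in the same unit (e.g. mL).
    """
    if sample_volume <= 0 or extinction <= 0:
        raise ValueError("sample_volume and extinction must be positive")
    return (delta_e_per_min * final_volume * dilution_factor) / (sample_volume * extinction)


def units_per_mg(units_per_ml, protein_mg_per_ml):
    """Convert units/mL to units/mg using the extract's protein concentration."""
    if protein_mg_per_ml <= 0:
        raise ValueError("protein concentration must be positive")
    return units_per_ml / protein_mg_per_ml


# Hypothetical reading: 0.142 absorbance units/min, 1 mL reaction, 0.01 mL extract.
activity = transferase_units_per_ml(0.142, final_volume=1.0, sample_volume=0.01)
print(round(activity, 3), round(units_per_mg(activity, 1.0), 3))
```

With an extract at about 1 mg/mL, as stated above, the second function returns the same number as the first, which is why the text treats units/mL and units/mg as interchangeable.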

Cell free extracts from cells containing nucleic acid encoding the CoA transferase
exhibited CoA transferase activity (Table 2). The observed CoA transferase activity was
detected for the lactyl-CoA, propionyl-CoA, acrylyl-CoA, and 3-hydroxypropionyl-CoA
substrates (Table 2). The highest CoA transferase activity was detected for lactyl-CoA
and propionyl-CoA.



Table 2

Substrate                   Units/mg
Lactyl-CoA                  211
Propionyl-CoA               144
Acrylyl-CoA                 118
3-Hydroxypropionyl-CoA      110



The following assay was performed to test whether the CoA transferase activity
can use the same CoA substrate donors as recipients. Specifically, CoA transferase
activity was assessed using a Matrix-assisted Laser Desorption/Ionization Time of Flight
Mass Spectrometry (MALDI-TOF MS) Voyager RP workstation (PerSeptive
Biosystems). The following five reactions were analyzed:

1) acetate + lactyl-CoA -> lactate + acetyl-CoA

2) acetate + propionyl-CoA -> propionate + acetyl-CoA

3) lactate + acetyl-CoA -> acetate + lactyl-CoA

4) lactate + acrylyl-CoA -> acrylate + lactyl-CoA

5) 3-hydroxypropionate + lactyl-CoA -> lactate + 3-hydroxypropionyl-CoA

MALDI-TOF MS was used to measure simultaneously the appearance of the
product CoA ester and the disappearance of the donor CoA ester. The assay buffer
contained 50 mM potassium phosphate (pH 7.0), 1 mM CoA ester, and 100 mM
respective acid salt. Protein from a cell free extract prepared as described above was
added to a final concentration of 0.005 mg/mL. A control reaction was prepared from a
cell free extract prepared from cells lacking the construct containing the CoA
transferase-encoding nucleic acid. For each reaction, the cell free extract was added last
to start the reaction. Reactions were allowed to proceed at room temperature and were
stopped by adding 1 volume 10% trifluoroacetic acid (TFA). The reaction mixtures were purified
prior to MALDI-TOF MS analysis using Sep Pak Vac C18 50 mg columns (Waters, Inc.).
The columns were conditioned with 1 mL methanol and equilibrated with two washes of
1 mL 0.1% TFA. Each sample was applied to the column, and the flow through was
discarded. The column was washed twice with 1 mL 0.1% TFA. The sample was eluted
in 200 µL 40% acetonitrile, 0.1% TFA. The acetonitrile was removed by centrifugation
in vacuo. Samples were prepared for MALDI-TOF MS analysis by mixing 1:1 with 110
mM sinapinic acid in 0.1% TFA, 67% acetonitrile. The samples were allowed to air dry.

In reaction #1, the control sample exhibited a main peak at a molecular weight
corresponding to lactyl-CoA (MW 841). There was a minor peak at the molecular weight
corresponding to acetyl-CoA (MW 811). This minor peak was determined to be the
left-over acetyl-CoA from the synthesis of lactyl-CoA. The reaction #1 sample containing the
cell extract from cells transfected with the CoA transferase-encoding plasmid exhibited
complete conversion of lactyl-CoA to acetyl-CoA. No peak was observed for lactyl-CoA.
This result indicates that the CoA transferase activity can transfer CoA from lactyl-CoA
to acetate to form acetyl-CoA.

In reaction #2, the control sample exhibited a dominant peak at a molecular
weight corresponding to propionyl-CoA (MW 825). The reaction #2 sample containing
the cell extract from cells transfected with the CoA transferase-encoding plasmid
exhibited a dominant peak at a molecular weight corresponding to acetyl-CoA (MW 811).
No peak was observed for propionyl-CoA. This result indicates that the CoA transferase
activity can transfer CoA from propionyl-CoA to acetate to form acetyl-CoA.

In reaction #3, the control sample exhibited a dominant peak at a molecular
weight corresponding to acetyl-CoA (MW 811). The reaction #3 sample containing the
cell extract from cells transfected with the CoA transferase-encoding plasmid exhibited a
peak corresponding to lactyl-CoA (MW 841). The peak corresponding to acetyl-CoA did
not disappear. In fact, the ratio of the size of the two peaks was about 1:1. The observed
appearance of the peak corresponding to lactyl-CoA demonstrates that the CoA
transferase activity catalyzes reaction #3.

In reaction #4, the control sample exhibited a dominant peak at a molecular
weight corresponding to acrylyl-CoA (MW 823). The reaction #4 sample containing the
cell extract from cells transfected with the CoA transferase-encoding plasmid exhibited a
dominant peak corresponding to lactyl-CoA (MW 841). This result demonstrates that the
CoA transferase activity catalyzes reaction #4.

In reaction #5, deuterated lactyl-CoA was used to detect the transfer of CoA from
lactate to 3-hydroxypropionate since lactic acid and 3-HP have the same molecular
weight as do their respective CoA esters. Using deuterated lactyl-CoA allowed for the
differentiation between lactyl-CoA and 3-hydroxypropionyl-CoA using MALDI-TOF MS.
The control sample exhibited a diffuse group of peaks at molecular weights ranging from
MW 841 to 845 due to the varying amounts of hydrogen atoms that were replaced with
deuterium atoms. In addition, a significant peak was observed at a molecular weight
corresponding to acetyl-CoA (MW 811). This peak was determined to be the left-over
acetyl-CoA from the synthesis of lactyl-CoA. The reaction #5 sample containing the cell
extract from cells transfected with the CoA transferase-encoding plasmid exhibited a
dominant peak at a molecular weight corresponding to 3-hydroxypropionyl-CoA (MW
841) as opposed to a group of peaks ranging from MW 841 to 845. This result
demonstrates that the CoA transferase catalyzes reaction #5.
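
The peak assignments in reactions #1 to #5 can be summarized as a mass shift between the donor CoA ester and the product CoA ester. The following Python sketch uses only the molecular weights quoted above; the dictionary and function names are hypothetical and chosen for illustration. It shows why reaction #5, alone among the five, could not be followed without an isotope label.

```python
# Molecular weights of the CoA esters as quoted in the text (MALDI-TOF MS peaks).
REPORTED_MW = {
    "acetyl-CoA": 811,
    "propionyl-CoA": 825,
    "acrylyl-CoA": 823,
    "lactyl-CoA": 841,
    "3-hydroxypropionyl-CoA": 841,
}

# (donor CoA ester, product CoA ester) for reactions 1 to 5 as listed above.
REACTIONS = {
    1: ("lactyl-CoA", "acetyl-CoA"),
    2: ("propionyl-CoA", "acetyl-CoA"),
    3: ("acetyl-CoA", "lactyl-CoA"),
    4: ("acrylyl-CoA", "lactyl-CoA"),
    5: ("lactyl-CoA", "3-hydroxypropionyl-CoA"),
}


def mass_shift(reaction):
    """Return product MW minus donor MW for the numbered reaction."""
    donor, product = REACTIONS[reaction]
    return REPORTED_MW[product] - REPORTED_MW[donor]


def needs_isotope_label(reaction):
    """A reaction with no mass shift cannot be resolved by mass alone."""
    return mass_shift(reaction) == 0


for number in sorted(REACTIONS):
    print(number, mass_shift(number), needs_isotope_label(number))
```

Reactions #1 to #4 shift the CoA ester peak by 14 to 30 mass units, whereas reaction #5 gives no shift, which is consistent with the use of deuterated lactyl-CoA in that reaction.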






WO 02/42418 PCT/US01/43607



Example 2 - Cloning nucleic acid molecules that encode a
multiple polypeptide complex having lactyl-CoA dehydratase activity

The following methods were used to clone an E1 activator polypeptide. Briefly,
four degenerate forward and five degenerate reverse PCR primers were designed based on
conserved sequences of E1 activator protein homologs
(E1F1 5'-GCWACBGGYTAYGGYCG-3', SEQ ID NO:60;
E1F2 5'-GTYRTYGAYRTYGGYGGYCAGGA-3', SEQ ID NO:61;
E1F3 5'-ATGAACGAYAARTGYGCWGCWGG-3', SEQ ID NO:62;
E1F4 5'-TGYGCWGGWGGYACBGGYCGYTT-3', SEQ ID NO:63;
E1R1 5'-TCCTGRCCRCCRAYRTCRAYRAC-3', SEQ ID NO:64;
E1R2 5'-CCWGCWGCRCAYTTRTCGTTCAT-3', SEQ ID NO:65;
E1R3 5'-AARCGRCCVG'mCCWGCWG-CRCA-3', SEQ ID NO:66;
E1R4 5'-GCTTCGSWTTCRACRATGSW-3', SEQ ID NO:67; and
E1R5 5'-GSWRATRACTJCGCWTTCWGCRAA-3', SEQ ID NO:68).

The primers were used in all logical combinations in PCR using Taq polymerase
(Roche Molecular Biochemicals, Indianapolis, IN) and 1 ng of genomic DNA per µL
reaction mix. PCR was conducted using a touchdown PCR program with 4 cycles at an
annealing temperature of 60°C, 4 cycles at 58°C, 4 cycles at 56°C, and 18 cycles at 54°C.
Each cycle used an initial 30-second denaturing step at 94°C and a 3 minute extension
step at 72°C. The program had an initial denaturing step for 2 minutes at 94°C and a final
extension step of 4 minutes at 72°C. Time allowed for annealing was 45 seconds. The
amounts of PCR primer used in the reactions were increased 2-10 fold above typical PCR
amounts depending on the amount of degeneracy in the 3' end of the primer. In addition,
separate PCR reactions containing each individual primer were made to identify PCR
product resulting from single degenerate primers. Each PCR product (25 µL) was
separated by electrophoresis using a 1% TAE (Tris-acetate-EDTA) agarose gel.
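
Because the primer amounts were scaled with degeneracy, it can help to count how many distinct sequences each degenerate primer represents. The Python sketch below does this with the standard IUPAC nucleotide codes and also computes reverse complements. The helper names are hypothetical, and only the forward primers whose sequences read cleanly in the text are included.

```python
from math import prod

# IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Complement of each IUPAC code (for example R = A/G pairs with Y = T/C).
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")

# Forward primers as listed in the text.
PRIMERS = {
    "E1F1": "GCWACBGGYTAYGGYCG",
    "E1F2": "GTYRTYGAYRTYGGYGGYCAGGA",
    "E1F3": "ATGAACGAYAARTGYGCWGCWGG",
    "E1F4": "TGYGCWGGWGGYACBGGYCGYTT",
}


def degeneracy(primer):
    """Number of distinct sequences encoded by a degenerate primer."""
    return prod(len(IUPAC[base]) for base in primer)


def reverse_complement(primer):
    """Reverse complement that keeps IUPAC ambiguity codes consistent."""
    return primer.translate(COMPLEMENT)[::-1]


for name, sequence in PRIMERS.items():
    print(name, degeneracy(sequence), reverse_complement(sequence))
```

Run on the listed sequences, the reverse complements of E1F2 and E1F3 equal E1R1 and E1R2, respectively, so those reverse primers target the same conserved sites as their forward counterparts.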

The E1F2-E1R4, E1F2-E1R5, E1F3-E1R4, E1F3-E1R5, and E1F4-E1R4R2
combinations produced a band of 195, 207, 144, 156, and 144 bp, respectively. These
bands matched the expected size based on E1 activator sequences from other species. No
band was visible with individual primer control reactions. The E1F2-E1R5 fragment
(207 bp) was isolated and purified using Qiagen Gel Extraction procedure (Qiagen Inc.,
Valencia, CA). The purified band (4 µL) was ligated into a pCRII vector that then was
transformed into TOP10 E. coli cells by heat-shock using a TOPO cloning procedure
(Invitrogen, Carlsbad, CA). Transformations were plated on LB media containing 100
µg/mL of ampicillin (Amp) and 50 µg/mL of
5-Bromo-4-Chloro-3-Indolyl-β-D-Galactopyranoside (X-gal). Single, white colonies were
plated onto fresh media and
screened in a PCR reaction using the E1F2 and E1R5 primers to confirm the presence of
the insert. Plasmid DNA was obtained from multiple colonies using a QiaPrep Spin
Miniprep Kit (Qiagen, Inc.). Once obtained, the plasmid DNA was quantified and used
for DNA sequencing with M13R and M13F primers. Sequence analysis revealed a
nucleic acid sequence encoding a polypeptide and revealed that the E1F2-E1R5 fragment
shared sequence similarity with E1 activator sequences (Figures 12-13).

Genome walking was performed to obtain the complete coding sequence of the E2 α
and β subunits. Briefly, four primers for performing genome walking in both upstream
and downstream directions were designed using the portion of the 207 bp E1F2-E1R5
fragment sequence that was internal to the E1F2 and E1R5 degenerate primers
(E1GSP1F 5'-ACGTCATGTCGAAGGTACTGGAAATCC-3', SEQ ID NO:69;
E1GSP2F 5'-GGGACTGGTACTTCAAATCGAAGCATC-3', SEQ ID NO:70;
E1GSP1R 5'-TGACGGCAGCGGGATGCTTCGATTTGA-3', SEQ ID NO:71; and
E1GSP2R 5'-TCAGACATGGGGATTTCCAGTACCTTC-3', SEQ ID NO:72). The E1GSP1F and
E1GSP2F primers face downstream, while the E1GSP1R and E1GSP2R primers face
upstream. In addition, the E1GSP2F and E1GSP2R primers are nested inside the
E1GSP1F and E1GSP1R primers.

Genome walking was performed using the Universal Genome Walking Kit
(ClonTech Laboratories, Inc., Palo Alto, CA) with the exception that additional libraries
were generated with enzymes Nru I, Sca I, and Hinc II. First round PCR was performed
in a Perkin Elmer 2400 Thermocycler with 7 cycles of 2 seconds at 94°C and 3 minutes at
72°C, and 36 cycles of 2 seconds at 94°C and 3 minutes at 65°C with a final extension at
65°C for 4 minutes. Second round PCR used 5 cycles of 2 seconds at 94°C and 3 minutes
at 72°C, and 20 cycles of 2 seconds at 94°C and 3 minutes at 65°C with a final extension
at 65°C for 4 minutes. The first and second round product (20 µL) was separated by
electrophoresis using 1% TAE agarose gel. Amplification products were obtained with
the Stu I library for both forward and reverse directions. The second round products of
about 1.5 kb for the forward direction and 3 kb for the reverse direction from the Stu I
library were gel purified, cloned, and sequenced. Sequence analysis revealed that the
sequence derived from genome walking overlapped with the E1F2-E1R5 fragment.

To obtain additional sequence, a second genome walk was performed using a first
round primer (E1GSPF5 5'-CGTGTTACTTGGGAAGGTATCGCTGTCTG-3', SEQ
ID NO:73) and a second round primer
(E1GSPF6 5'-GCCAATGAAGGAGGAAACCACTAATGAGTC-3', SEQ ID NO:74). The genome walk was
performed using the Nru I, Sca I, and Hinc II libraries. In addition, ClonTech's
Advantage-Genomic Polymerase was used for the PCR. First round PCR was performed in a
Perkin Elmer 2400 Thermocycler with an initial denaturing step at 94°C for 2 minutes, 7 cycles of 2
seconds at 94°C and 3 minutes at 72°C, and 36 cycles of 2 seconds at 94°C and 3 minutes
at 65°C with a final extension at 65°C for 4 minutes. Second round PCR used 5 cycles of
2 seconds at 94°C and 3 minutes at 72°C, and 20 cycles of 2 seconds at 94°C and 3
minutes at 65°C with a final extension at 65°C for 4 minutes. The first and second round
product (20 µL) was separated by electrophoresis on a 1% agarose gel. An amplification
product of about 1.5 kb was obtained from second round PCR of the Hinc II library. This
band was gel purified, cloned, and sequenced. Sequence analysis revealed that it
overlapped with the previously obtained genome walk fragment. In addition, sequence
analysis revealed a nucleic acid sequence encoding an E2 α subunit that shares sequence
similarities with other sequences (Figures 16-17). Further, sequence analysis revealed a
nucleic acid sequence encoding an E2 β subunit that shares sequence similarities with
other sequences (Figures 20-21).

Additional PCR and sequence analysis revealed the order of polypeptide encoding
sequences within the region containing the lactyl-CoA dehydratase-encoding sequences.
Specifically, the E1GSP1F and COAGSP1R primer pair and the COAGSP1F and
E1GSP1R primer pair were used to amplify fragments that encode both the CoA
transferase and E1 activator polypeptides. Briefly, M. elsdenii genomic DNA (1 ng) was
used as a template. The PCR was conducted in a Perkin Elmer 2400 Thermocycler using
Long Template Polymerase (Roche Molecular Biochemicals, Indianapolis, IN). The
PCR program used was as follows: 94°C for 2 minutes; 29 cycles of 94°C for 30 seconds,
61°C for 45 seconds, and 72°C for 6 minutes; and a final extension of 72°C for 10
minutes. Both PCR products (20 µL) were separated on a 1% agarose gel. An
amplification product (about 1.5 kb) was obtained using the COAGSP1F and E1GSP1R
primer pair. This product was gel purified, cloned, and sequenced (Figure 22).

The organization of the M. elsdenii operon containing the lactyl-CoA
dehydratase-encoding sequences was determined to contain the following polypeptide-encoding
sequences in the following order: CoA transferase (Figure 6), ORFX (Figure 23), E1
activator protein of lactyl-CoA dehydratase (Figure 10), E2 α subunit of lactyl-CoA
dehydratase (Figure 14), E2 β subunit of lactyl-CoA dehydratase (Figure 18), and
truncated CoA dehydrogenase (Figure 25).

The lactyl-CoA dehydratase (lactyl-CoA dehydratase or lcd) from M. elsdenii was
PCR amplified from chromosomal DNA using the following program: 94°C for 2
minutes; 7 cycles of 94°C for 30 seconds, 47°C for 45 seconds, and 72°C for 3 minutes;
25 cycles of 94°C for 30 seconds, 54°C for 45 seconds, and 72°C for 3 minutes; and 72°C
for 7 minutes. One primer pair was used
(OSNBE1F 5'-GGGAATTCCATATGAAAACTGTGTATACTCTC-3', SEQ ID NO:75 and
OSNBE1R 5'-CGACGGATCCTTAGAGGATTTCCGAGAAAGC-3', SEQ ID NO:76). The amplified product
(about 3.2 kb) was separated on 1% agarose gel, cut from the gel, and purified with a
Qiagen Gel Extraction kit (Qiagen, Valencia, CA). The purified product was digested
with Nde I and BamH I restriction enzymes and ligated into pET11a vector (Novagen)
digested with the same enzymes. The ligation reaction was transformed into NovaBlue
chemically competent cells (Novagen) that then were spread on LB agar plates
supplemented with 50 µg/mL carbenicillin. Isolated individual colonies were screened
for the presence of inserts by restriction mapping. Isolates with the correct restriction
pattern were sequenced from each end using Novagen primers (T7 promoter primer
#69348-3 and T7 terminator primer #69337-3) to confirm the sequence at the ligation
points.

A plasmid having the correct insert was transformed into Tuner (DE3) pLacI
chemically competent cells (Novagen, Madison, WI). Expression from this construct was
tested as follows. A culture was grown overnight to saturation and diluted 1:20 the
following morning in fresh LB medium with the appropriate antibiotics. The culture was
grown at 37°C with aeration to an OD600 of about 0.6. The culture was induced with
IPTG at a final concentration of 100 µM. The culture was incubated for an additional two
hours at 37°C with aeration. Aliquots were taken pre-induction and 2 hours
post-induction for SDS-PAGE analysis. Bands of the expected molecular weight (27,024
Daltons for the E1 subunit, 48,088 Daltons for the E2 α subunit, and 42,517 Daltons for
the E2 β subunit, all predicted from the sequence) were observed. These bands were not
observed in cells containing a plasmid lacking the nucleic acid encoding the three
components of the lactyl-CoA dehydratase.

Cell free extracts were prepared by growing cells in a sealed serum bottle
overnight at 37°C. Following overnight growth, the cultures were induced with 1 mM
IPTG (added using anaerobic technique) and incubated an additional 2 hours at 37°C. The
cells were harvested by centrifugation and disrupted by sonication under strict anaerobic
conditions. The sonicated cell suspension was centrifuged to remove cell debris, and the
supernatant was used in the assays. The buffer used for cell resuspension/sonication was
50 mM Tris-HCl (pH 7.5), 200 µM ATP, 7 mM Mg(SO4), 4 mM DTT, 1 mM dithionite,
and 100 µM NADH.

Dehydratase activity was detected with MALDI-TOF MS. The assay was
conducted in the same buffer as above with 1 mM lactyl-CoA or 1 mM acrylyl-CoA
added and about 5 mg/mL cell free extract. Prior to MALDI-TOF MS analysis, samples
were purified using Sep Pak Vac C18 columns (Waters, Inc.) as described in Example 1.
The following two reactions were analyzed:

1) acrylyl-CoA -> lactyl-CoA

2) lactyl-CoA -> acrylyl-CoA

In reaction #1, the control sample exhibited a peak at a molecular weight
corresponding to acrylyl-CoA (MW 823). The reaction #1 sample containing the cell
extract from cells transfected with the dehydratase-encoding plasmid exhibited a major
peak at a molecular weight corresponding to lactyl-CoA (MW 841). This result indicates
that the dehydratase activity can convert acrylyl-CoA into lactyl-CoA.

To detect dehydratase activity on lactyl-CoA, reaction #2 was carried out in 80%
D2O. The control sample exhibited a peak at a molecular weight corresponding to
lactyl-CoA (MW 841). The reaction #2 sample containing the cell extract from cells transfected
with the dehydratase-encoding plasmid revealed a lactyl-CoA peak shifted to a deuterated
form. This result indicates that the dehydratase enzyme is active on lactyl-CoA. In
addition, the results from both reactions indicate that the dehydratase enzyme can
catalyze the lactyl-CoA ↔ acrylyl-CoA reaction in both directions.

Example 3 - Cloning nucleic acid molecules that encode
a polypeptide having 3-hydroxypropionyl CoA dehydratase activity

Genomic DNA was isolated from Chloroflexus aurantiacus cells (ATCC 29365).
Briefly, C. aurantiacus cells in 920 Chloroflexus medium were grown in 50 mL cultures
(Falcon 2070 polypropylene tubes) using an Innova 4230 Incubator Shaker (New
Brunswick Scientific; Edison, NJ) at 50°C with interior lights. Once grown, the cells
were pelleted, washed with 5 mL of a 10 mM Tris solution, and re-pelleted. Genomic
DNA was isolated from the pelleted cells using a Gentra Genomic "Puregene" DNA
isolation kit (Gentra Systems; Minneapolis, MN). Briefly, the pelleted cells were
resuspended in 1 mL Gentra Cell Suspension Solution to which 14.2 mg of lysozyme and
4 µL of 20 mg/mL proteinase K solution was added. The cell suspension was incubated
at 37°C for 30 minutes. The precipitated genomic DNA was recovered by centrifugation
at 3500 x g for 25 minutes and air-dried for 10 minutes. The genomic DNA was
suspended in 300 µL of a 10 mM Tris solution and stored at 4°C.

The genomic DNA was used as a template in PCR amplification reactions with
primers designed based on conserved domains of crotonase homologs and a Chloroflexus
aurantiacus codon usage table. Briefly, two degenerate forward (CRF1 and CRF2) and
three degenerate reverse (CRR1, CRR2, and CRR3) PCR primers were designed
(CRF1 5'-AAYCGBCCVAARGCNCTSAAYGC-3', SEQ ID NO:77;
CRF2 5'-TTYGTBGCNGGYGCNGAYAT-3', SEQ ID NO:78;
CRR1 5'-ATRTCNGCRCCNGCVACRAA-3', SEQ ID NO:79;
CRR2 5'-CCRCCRCCSAGNGCRWARCCRTT-3', SEQ ID NO:80; and
CRR3 5'-SSWNGCRATVCGRATRTCRAC-3', SEQ ID NO:81).

These primers were used in all logical combinations in PCR using Taq polymerase
(Roche Molecular Biochemicals; Indianapolis, IN) and 1 ng of the genomic DNA per µL
reaction mix. The PCR was conducted using a touchdown PCR program with 4 cycles at
an annealing temperature of 61°C, 4 cycles at 59°C, 4 cycles at 57°C, 4 cycles at 55°C,
and 16 cycles at 52°C. Each cycle used an initial 30-second denaturing step at 94°C and
a 3-minute extension step at 72°C. The program also had an initial denaturing step for 2
minutes at 94°C and a final extension step of 4 minutes at 72°C. The time allowed for
annealing was 45 seconds. The amounts of PCR primer used in the reaction were
increased 4-12 fold above typical PCR amounts depending on the amount of degeneracy
in the 3' end of the primer. In addition, separate PCR reactions containing each
individual primer were performed to identify amplification products resulting from single
degenerate primers. Each PCR product (25 µL) was separated by gel electrophoresis
using a 1% TAE (Tris-acetate-EDTA) agarose gel.
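
The touchdown program described above can be written out as an explicit schedule. The Python sketch below expands the stated stages into per-cycle annealing temperatures and adds up the programmed hold times. The names are hypothetical, and the time estimate covers only the stated holds, so it ignores the ramp time of the thermocycler.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    cycles: int      # number of cycles run at this annealing temperature
    anneal_c: int    # annealing temperature in degrees Celsius


# Stages as stated in the text for the crotonase-homolog primers.
TOUCHDOWN = [Stage(4, 61), Stage(4, 59), Stage(4, 57), Stage(4, 55), Stage(16, 52)]

# Hold times in seconds, taken from the text.
DENATURE_S, ANNEAL_S, EXTEND_S = 30, 45, 180
INITIAL_DENATURE_S, FINAL_EXTEND_S = 120, 240


def expand(stages):
    """Return the annealing temperature of every cycle, in run order."""
    return [stage.anneal_c for stage in stages for _ in range(stage.cycles)]


def hold_time_seconds(stages):
    """Sum of programmed hold times; thermocycler ramp time is not included."""
    total_cycles = sum(stage.cycles for stage in stages)
    per_cycle = DENATURE_S + ANNEAL_S + EXTEND_S
    return INITIAL_DENATURE_S + total_cycles * per_cycle + FINAL_EXTEND_S


schedule = expand(TOUCHDOWN)
print(len(schedule), schedule[0], schedule[-1], hold_time_seconds(TOUCHDOWN) / 60)
```

The same structure describes the Example 2 program if the stage list is replaced with 4 cycles each at 60°C, 58°C, and 56°C followed by 18 cycles at 54°C.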

The CRF1-CRR1 and CRF2-CRR2 combinations produced a unique band of
about 120 and about 150 bp, respectively. These bands matched the expected size based
on crotonase genes from other species. No 120 bp or 150 bp band was observed from
individual primer control reactions. Both fragments (i.e., the 120 bp and 150 bp bands)
were isolated and purified using the Qiagen Gel Extraction kit (Qiagen Inc., Valencia,
CA). Each purified fragment (4 µL) was ligated into pCRII vector that then was
transformed into TOP10 E. coli cells by a heat-shock method using a TOPO cloning
procedure (Invitrogen, Carlsbad, CA). Transformations were plated on LB media
containing 100 µg/mL of ampicillin (Amp) and 50 µg/mL of
5-Bromo-4-Chloro-3-Indolyl-β-D-Galactopyranoside (X-gal). Single, white colonies were
plated onto fresh
media and screened in a PCR reaction using the CRF1 and CRR1 primers and the CRF2
and CRR2 primers to confirm the presence of the desired insert. Plasmid DNA was
obtained from multiple colonies with the desired insert using a QiaPrep Spin Miniprep
Kit (Qiagen, Inc.). Once obtained, the DNA was quantified and used for DNA
sequencing with M13R and M13F primers. Sequence analysis revealed the presence of
two different clones from the PCR product of about 150 bp. Each shared sequence
similarity with crotonase and hydratase sequences. The two clones were designated
OS17 (157 bp PCR product) and OS19 (151 bp PCR product).

Genome walking was performed to obtain the complete coding sequence of OS17.
Briefly, primers for conducting genome walking in both upstream and downstream
directions were designed using the portion of the 157 bp CRF2-CRR2 fragment sequence
that was internal to the CRF2 and CRR2 degenerate primers
(OS17F1 5'-CGCTGATATTCGCCAGTTGCTCGAAG-3', SEQ ID NO:82;
OS17F2 5'-CCCATCTTGCTTTCCGCAAGATTGAGC-3', SEQ ID NO:83;
OS17F3 5'-CAATGGCCCTGCCGAATAACGCCCATCT-3', SEQ ID NO:84;
OS17R1 5'-CTTCGAGCAACTGGCGAATATCAGCG-3', SEQ ID NO:85;
OS17R2 5'-GCTCAATCTTGCGGAAAGCAAGATGGG-3', SEQ ID NO:86; and
OS17R3 5'-AGATGGGCGTTATTCGGCAGGGCCATTG-3', SEQ ID NO:87). The OS17F1, OS17F3, and
OS17F2 primers face downstream, while the OS17R2, OS17R3, and OS17R1 primers face
upstream.

Genome walking was conducted using the Universal Genome Walking kit
(ClonTech Laboratories, Inc., Palo Alto, CA) with the exception that additional libraries
were generated with enzymes Nru I, Fsp I, and Hinc II. The first round PCR was
conducted in a Perkin Elmer 2400 Thermocycler with 7 cycles of 2 seconds at 94°C and 3
minutes at 72°C, and 36 cycles of 2 seconds at 94°C and 3 minutes at 66°C with a final
extension at 66°C for 4 minutes. Second round PCR used 5 cycles of 2 seconds at 94°C
and 3 minutes at 72°C, and 20 cycles of 2 seconds at 94°C and 3 minutes at 66°C with a
final extension at 66°C for 4 minutes. The first and second round amplification product
(5 µL) was separated by gel electrophoresis on a 1% TAE agarose gel. After the second
round PCR, an amplification product of about 0.4 kb was obtained with the Fsp I library
using the OS17R1 primer in the reverse direction, and an amplification product of about
0.6 kb was obtained with the TT library using the OS17F2 primer in the forward
direction. These PCR products were cloned and sequenced.

Sequence analysis revealed that the sequences derived from genome walking
overlapped with the CRF2-CRR2 fragment and shared sequence similarity with crotonase
and hydrolase sequences.

A second genome walk was performed to obtain additional sequences. Six
primers were designed for this second genome walk
(OS17F4 5'-AAGCTGGGTCTGATCGATGCCATTGCTACC-3', SEQ ID NO:88;
OS17F5 5'-CTCGATTATCGCCCATCCACGTATCGAG-3', SEQ ID NO:89;
OS17F6 5'-TGGATGCAATCCGCTATGGCATTATCCACG-3', SEQ ID NO:90;
OS17R4 5'-TCATTCAGTGCGTTCACCGGGGGATTTGTC-3', SEQ ID NO:91;
OS17R5 5'-TCGATCCGGAAGTAGCGATAGCGTTCGATG-3', SEQ ID NO:92; and
OS17R6 5'-CTTGGCTGCAATCTCTTCGAGCACTTCAGG-3', SEQ ID NO:93). The OS17F4, OS17F5, and
OS17F6 primers faced downstream, while the OS17R4, OS17R5, and OS17R6 primers faced
upstream.

The second genome walk was performed using the same methods described for
the first genome walk. After the second round of walking, an amplification product of
about 2.3 kb was obtained with a Hinc II library using the OS17R5 primer in the reverse
direction, and an amplification product of aboutU6 kb was obtained with a Pvu II library 
using the OS17F5 primer in the forward direction. The PCR products were cloned and
sequenced. Sequence analysis revealed that the sequences derived from the second
genome walk overlapped with the sequence obtained during the first genome walk.
In addition, the sequence analysis revealed a sequence of 3572 bp.

A BLAST search revealed that the polypeptide encoded by this sequence shares
sequence similarity with polypeptides having three different activities. Specifically, the
beginning of the OS17 encoded-polypeptide shares sequence similarity with
CoA-synthesases, the middle region of the OS17 encoded-polypeptide shares sequence
similarity with enoyl-CoA hydratases, and the end region of the OS17
encoded-polypeptide shares sequence similarity with CoA-reductases.

A third genome walk was performed using four primers
(OS17UP-6 5'-CATCAGAGGTAATCACCACTCGTGCA-3', SEQ ID NO:94;
OS17UP-7 5'-AAGTAGTAGGCCACCTCGTCGCCATA-3', SEQ ID NO:95;
OS17DN-1 5'-GCCAATCAGGCGCTGATCTATGTTCT-3', SEQ ID NO:96; and
OS17DN-2 5'-CTGATCTATGTTCTGGCCTCGGAGGT-3', SEQ ID NO:97). The OS17UP-6 and
OS17UP-7 primers face upstream, while the OS17DN-1 and OS17DN-2 primers face
downstream. The third genome walk yielded an amplification product of about 1.2 kb
with a Nru I library using the OS17UP-7 primer in the reverse direction. In addition,
amplification products of about 4 kb and about 1.1 kb were obtained with a Hinc II and
Fsp I library, respectively, using the OS17DN-2 primer in the forward direction.
Sequence analysis revealed a nucleic acid sequence encoding a polypeptide (Figures
27-28). The complete OS17 gene had 5466 nucleotides and encoded a 1822 amino acid
polypeptide. The calculated molecular weight of the OS17 polypeptide from the
sequence was 201,346 (pI = 5.71).

A BLAST search analysis revealed that the product of the OS17 nucleic acid has
three different activities based on sequence similarity to (1) CoA-synthesases at the
beginning of the OS17 sequence, (2) 3-HP dehydratases in the middle of the OS17
sequence, and (3) CoA-reductases at the end of the OS17 sequence. Thus, the OS17
clone appeared to encode a single enzyme capable of catalyzing three distinct reactions
leading to the direct conversion of 3-hydroxypropionate to propionyl CoA:
3-HP -> 3-HP-CoA -> acrylyl-CoA -> propionyl-CoA.

The OS17 gene from C. aurantiacus was PCR amplified from chromosomal DNA
using the following conditions: 94°C for 3 minutes; 25 cycles of 94°C for 30 seconds to
denature, 54°C for 30 seconds to anneal, and 68°C for 6 minutes for extension; followed
by 68°C for 10 minutes for final extension. Two primers were used (OS17F
5'-GGGAATTCCATATGATCGACACTGCG-3', SEQ ID NO:136; and OS17R
5'-CGAAGGATCCAACGATAATCGGCTCAGCAC-3', SEQ ID NO:137). The resulting
PCR product (~5.6 kb) was purified using a Qiagen PCR purification kit (Qiagen Inc.,
Valencia, CA). The purified product was digested with NdeI and BamHI restriction
enzymes, heated at 80°C for 20 minutes to inactivate the enzymes, purified using a Qiagen
PCR purification kit, and ligated into a pET11a vector (Novagen, Madison, WI)
previously digested with NdeI and BamHI enzymes. The ligation reaction was
transformed into NovaBlue chemically competent cells (Novagen, Madison, WI) that
were spread on LB agar plates supplemented with 50 µg/mL carbenicillin. Individual
transformants were screened by PCR amplification of the OS17 DNA directly from
colony cells, using the OS17F and OS17R primers and the conditions described above.
Clones that yielded the 5.6 kb product were used for plasmid purification with a Qiagen
QiaPrep Spin Miniprep Kit (Qiagen, Inc.). The resulting plasmids were transformed into
E. coli BL21(DE3) cells, and OS17 polypeptide expression was induced. The apparent
molecular weight of the OS17 polypeptide according to SDS gel electrophoresis was
about 190,000 Da.
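
As an illustration only, the OS17 cycling program stated above can be written as data and its total hold time added up. The sketch below ignores ramp time between temperatures, and the `Step` and `program_minutes` names are hypothetical helpers rather than part of the described method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float   # block temperature in degrees Celsius
    seconds: float  # hold time in seconds


def program_minutes(initial, cycle, n_cycles, final):
    """Total hold time in minutes, ignoring ramp time between steps."""
    cycle_seconds = sum(step.seconds for step in cycle)
    total_seconds = initial.seconds + n_cycles * cycle_seconds + final.seconds
    return total_seconds / 60


# OS17 program from the text: 94°C 3 min; 25 x (94°C 30 s, 54°C 30 s, 68°C 6 min); 68°C 10 min
os17_cycle = [Step(94, 30), Step(54, 30), Step(68, 360)]
os17_minutes = program_minutes(Step(94, 180), os17_cycle, 25, Step(68, 600))
print(os17_minutes)
```

The 6 minute extension for a product of about 5.6 kb corresponds to roughly one minute per kilobase.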

To assay OS17 polypeptide function, a 100 mL culture of BL21(DE3)/pET11a-OS17
cells was started using 1 mL of an overnight culture as an inoculum. The
culture was grown to an OD of 0.5-0.6 and was induced with 100 µM IPTG. After two
and a half hours of induction, the cells were harvested by spinning at 8000 rpm in a
floor centrifuge. The cells were washed with 10 mM Tris-HCl (pH 7.8) and passed twice
through a French Press at a gauge pressure of 1000 psi. The cell debris was removed by
centrifugation at 15,000 rpm. The activity of the OS17 polypeptide was measured
spectrophotometrically, and the products formed during this enzymatic transformation
were detected by LC/MS. The assay mix was as follows (J. Bacteriol., 181:1088-1098
(1999)):

Reagent                        Volume        Final Conc.
Tris-HCl (1000 mM, pH 7.8)     10 µL         50 mM
MgCl2 (100 mM)                 10 µL         5 mM
ATP (30 mM)                    20 µL         3 mM
KCl (100 mM)                   20 µL         10 mM
CoASH (5 mM)                   20 µL         0.5 mM
NAD(P)H                        20 µL         0.5 mM
3-hydroxypropionate            2 µL          1 mM
Protein extract (7 mg/mL)      20 (40) µL    140 µg
DI water                       78 (58) µL
Total                          200 µL
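
The final concentrations in the assay mix follow from the stock concentrations and volumes by simple dilution (C1 x V1 = C2 x V2). The sketch below checks only the rows for which the table states a stock concentration; the NAD(P)H and 3-hydroxypropionate stocks are not given and are therefore left out. The names `final_mM` and `reagents` are illustrative.

```python
TOTAL_UL = 200.0  # total assay volume from the table

# (reagent, stock concentration in mM, volume added in µL, stated final concentration in mM)
reagents = [
    ("Tris-HCl", 1000.0, 10.0, 50.0),
    ("MgCl2", 100.0, 10.0, 5.0),
    ("ATP", 30.0, 20.0, 3.0),
    ("KCl", 100.0, 20.0, 10.0),
    ("CoASH", 5.0, 20.0, 0.5),
]


def final_mM(stock_mM, volume_ul, total_ul=TOTAL_UL):
    """Final concentration after diluting volume_ul of stock into total_ul."""
    return stock_mM * volume_ul / total_ul


computed = {name: final_mM(stock, vol) for name, stock, vol, _ in reagents}

# Protein added: 7 mg/mL x 20 µL = 140 µg (mg/mL is numerically µg/µL)
protein_ug = 7.0 * 20.0

# Volumes for the 20 µL extract variant, in table order, should sum to the total
volumes_ul = [10, 10, 20, 20, 20, 20, 2, 20, 78]
volume_sum = sum(volumes_ul)

for name, _, _, stated in reagents:
    print(name, computed[name], "stated:", stated)
```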





The initial rate of reaction was measured by monitoring the disappearance of
NAD(P)H at 340 nm. The activity of the OS17 polypeptide was measured using 3-HP as
the substrate. The units/mL of total protein was calculated using the formula set forth in
Example 1. The activity of the expressed OS17 polypeptide was calculated to be 0.061
U/mL of total protein. The reaction products were purified using a Sep Pak Vac column
(Waters). The column was conditioned with 1 mL methanol and washed two times with
0.5 mL 0.1% TFA. The sample was then applied to the column, and the column was
washed two more times with 0.5 mL 0.1% TFA. The sample was eluted with 200 µL of
40% acetonitrile, 0.1% TFA. The acetonitrile was removed from the sample by vacuum
centrifugation. The reaction products were analyzed by LC/MS.
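
For orientation, a rate of NAD(P)H disappearance at 340 nm is commonly converted to enzyme units with the Beer-Lambert law. The sketch below is a generic version of that conversion and is not the formula of Example 1, which is not reproduced in this section. It assumes the standard literature extinction coefficient of 6.22 mM^-1 cm^-1 for NAD(P)H at 340 nm and a 1 cm path length, and the absorbance slope in the usage line is a made-up value.

```python
EPSILON_NADH_340 = 6.22  # mM^-1 cm^-1, standard literature value (assumption, not from the text)


def units_per_ml(delta_a_per_min, assay_volume_ml, extract_volume_ml, path_cm=1.0):
    """Micromoles of NAD(P)H consumed per minute per mL of extract."""
    # Beer-Lambert: concentration change in mM/min, which equals µmol/mL/min
    mM_per_min = delta_a_per_min / (EPSILON_NADH_340 * path_cm)
    umol_per_min = mM_per_min * assay_volume_ml
    return umol_per_min / extract_volume_ml


# Hypothetical slope of 0.0622 A/min in a 0.2 mL assay containing 0.02 mL of extract
example_units = units_per_ml(0.0622, 0.2, 0.02)
print(round(example_units, 4))
```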

Analyses of thioesters, namely propionyl-CoA, acrylyl-CoA, and 3-HP-CoA, from
the above reaction were carried out using a Waters/Micromass ZQ LC/MS instrument
which had a Waters 2690 liquid chromatograph with a Waters 996 Photo-Diode Array
(PDA) placed in series between the chromatograph and the single quadrupole mass
spectrometer. LC separations were made using a 4.6 x 150 mm YMC ODS-AQ (3 µm
particles, 120 Å pores) reversed-phase chromatography column at room temperature.
CoA esters were eluted in Buffer A (25 mM ammonium acetate, 0.5% acetic acid) with a
linear gradient of Buffer B (acetonitrile, 0.5% acetic acid). A flow rate of 0.25 mL/minute
was used, and photodiode array UV absorbance was monitored from 200 to 400 nm. All
parameters of the electrospray MS system were optimized and selected based on
generation of protonated molecular ions ([M+H]+) of the analytes of interest and
production of characteristic fragment ions. The following instrumental parameters were
used for ESI-MS detection of CoA and organic acid-CoA thioesters in the positive ion
mode: Extractor: 1 V; RF lens: 0 V; Source temperature: 100°C; Desolvation
temperature: 300°C; Desolvation gas: 500 L/hour; Cone gas: 40 L/hour; Low mass
resolution: 13.0; High mass resolution: 14.5; Ion energy: 0.5; Multiplier: 650.
Uncertainties for mass charge ratios (m/z) and molecular masses are ± 0.01%.
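
The stated ± 0.01% uncertainty can be turned into an acceptance window around an expected m/z value. The sketch below shows the arithmetic only; the m/z value of 800.0 is a hypothetical placeholder, not a measured ion from this example.

```python
REL_TOL = 1e-4  # ± 0.01% expressed as a fraction


def mz_window(expected_mz, rel_tol=REL_TOL):
    """Lower and upper m/z bounds for the given relative uncertainty."""
    delta = expected_mz * rel_tol
    return expected_mz - delta, expected_mz + delta


def matches(observed_mz, expected_mz, rel_tol=REL_TOL):
    """True if the observed m/z falls inside the uncertainty window."""
    low, high = mz_window(expected_mz, rel_tol)
    return low <= observed_mz <= high


low, high = mz_window(800.0)  # hypothetical ion at m/z 800.0
print(round(low, 2), round(high, 2))
```

At m/z 800 the window is about ± 0.08, so a peak at 800.05 would be accepted and one at 800.2 would not.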

The enzyme assay mix from strains expressing the OS17 polypeptide exhibited
peaks for propionyl-CoA, acrylyl-CoA, and 3-HP-CoA, with the propionyl-CoA peak
being the dominant peak. These peaks were missing in the enzyme assay mix obtained
from the control strain, which carried vector pET11a without an insert. These results
indicate that the OS17 polypeptide has CoA synthetase activity, CoA hydratase activity,
and dehydrogenase activity.

Genome walking also was performed to obtain the complete coding sequence of
OS19. Briefly, primers for conducting genome walking in both upstream and
downstream directions were designed using the portion of the 151 bp CRF2-CRR2
fragment sequence that was internal to the CRF2 and CRR2 degenerate primers (OS19F1
5'-GGCTGATATCAAAGCGATGGCCAATGC-3', SEQ ID NO:98; OS19F2
5'-CCACGCCTATTGATATGCTCACCAGTG-3', SEQ ID NO:99; OS19F3
5'-GCAAACCGGTGATTGCTGCCGTGAATGG-3', SEQ ID NO:100; OS19R1
5'-GCATTGGCCATGGCTTTGATATCAGCC-3', SEQ ID NO:101; OS19R2
5'-CACTGGTGAGCATATCAATAGGGGTGG-3', SEQ ID NO:102; and OS19R3
5'-CCATTCACGGCAGCAATCACCGGTTTGC-3', SEQ ID NO:103). The OS19F1, OS19F2, and
OS19F3 primers face downstream, while the OS19R1, OS19R2, and OS19R3 primers
face upstream.
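
Because the forward and reverse genome-walking primers were designed on the same internal stretch of the fragment, each reverse primer is expected to be the reverse complement of its forward partner. A minimal sketch of that check, using the OS19F3/OS19R3 pair as transcribed above (the function name is illustrative):

```python
# Translation table mapping each base to its Watson-Crick partner
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


OS19F3 = "GCAAACCGGTGATTGCTGCCGTGAATGG"  # SEQ ID NO:100 as given in the text
OS19R3 = "CCATTCACGGCAGCAATCACCGGTTTGC"  # SEQ ID NO:103 as given in the text

pair_is_complementary = reverse_complement(OS19F3) == OS19R3
print(pair_is_complementary)
```

The same comparison can be applied to the other primer pairs to catch transcription errors in a sequence listing.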

An amplification product of about 0.25 kb was obtained with the Fsp I library
using the OS19R1 primer, while an amplification product of about 0.65 kb was obtained
with the Pvu II library using the OS19R1 primer. In addition, an amplification product of
about 0.4 kb was obtained with the Pvu II library using the OS19F3 primer. The PCR
products were cloned and sequenced. Sequence analysis revealed that the sequences
derived from genome walking overlapped with the CRF2-CRR2 fragment and shared
sequence similarity with crotonase and hydratase sequences. The obtained sequences
accounted for most of the coding sequence including the start codon.

A second genome walk was performed to obtain additional sequence using two
primers (OS19F7 5'-TCATCATCGCCAGTGAAAACGCGCAGTTCG-3', SEQ ID
NO:104 and OS19F8 5'-GGATCGCGCAAACCATTGCCACCAAATCAC-3', SEQ ID
NO:105). The OS19F7 and OS19F8 primers face downstream.

An amplification product (about 0.7 kb) obtained from the Pvu II library was
cloned and sequenced. Sequence analysis revealed that the sequence derived from the
second genome walk overlapped with the sequence obtained from the first genome walk
and contained the stop codon. The full-length OS19 clone was found to share sequence
similarity with other sequences such as crotonase and enoyl-CoA hydratase sequences
(Figures 32-33).

The OS19 clone was found to encode a polypeptide having 3-hydroxypropionyl-
CoA dehydratase activity, also referred to as acrylyl-CoA hydratase activity. The nucleic
acid encoding the OS19 dehydratase from C. aurantiacus was PCR amplified from
chromosomal DNA using the following conditions: 94°C for 3 minutes; 25 cycles of
94°C for 30 seconds to denature, 56°C for 30 seconds to anneal, and 68°C for 1 minute
for extension; and 68°C for 5 minutes for final extension. Two primers were used
(OSACH3 5'-ATGAGTGAAGAGTCTCTGGTTCTCAGC-3', SEQ ID NO:106 and
OSACH2 5'-AGATCGCAATCGCTCGTGTATGTC-3', SEQ ID NO:107).
The resulting PCR product (about 1.2 kb) was separated by agarose gel
electrophoresis and purified using a Qiagen PCR purification kit (Qiagen Inc.; Valencia,
CA). The purified product was ligated into pETBlue-1 using the Perfectly Blunt cloning
Kit (Novagen; Madison, WI). The ligation reaction was transformed into NovaBlue
chemically competent cells (Novagen, Madison, WI) that then were spread on LB agar
plates supplemented with 50 µg/mL carbenicillin, 40 µg/mL IPTG, and 40 µg/mL X-Gal.
White colonies were isolated and screened for the presence of inserts by restriction
mapping. Isolates with the correct restriction pattern were sequenced from each end
using the primers pETBlueUP and pETBlueDOWN (Novagen) to confirm the sequence at
the ligation points.

The plasmid containing the OS19 dehydratase-encoding sequence was
transformed into Tuner (DE3) pLacI chemically competent cells (Novagen, Madison,
WI), and expression from the construct was tested. Briefly, a culture was grown overnight
to saturation and diluted 1:20 the following morning in fresh LB medium with the
appropriate antibiotics. The culture was grown at 37°C and 250 rpm to an OD600 of about
0.6. At this point, the culture was induced with IPTG at a final concentration of 1 mM.
The culture was incubated for an additional two hours at 37°C and 250 rpm. Aliquots
were taken pre-induction and 2 hours post-induction for SDS-PAGE analysis. A band of
the expected molecular weight (27,336 Daltons predicted from the sequence) was
observed. This band was not observed in cells containing a plasmid lacking the nucleic
acid encoding the hydratase.

Cell free extracts were prepared by growing cells as described above. The cells
were harvested by centrifugation and disrupted by sonication. The sonicated cell
suspension was centrifuged to remove cell debris, and the supernatant was used in the
assays. The ability of the 3-hydroxypropionyl-CoA dehydratase to perform the following
three reactions was measured using MALDI-TOF MS:
1) acrylyl-CoA → 3-hydroxypropionyl-CoA
2) 3-hydroxypropionyl-CoA → acrylyl-CoA
3) crotonyl-CoA → 3-hydroxybutyryl-CoA

The assay mixture contained 50 mM Tris-HCl (pH 7.5), 1 mM CoA ester, and
about 1 µg cell free extract. Reactions were allowed to proceed at room temperature and
were stopped by adding 1 volume of 10% trifluoroacetic acid (TFA). The reaction mixtures
were purified prior to MALDI-TOF MS analysis using Sep Pak Vac C18 50 mg columns
(Waters, Inc.). The columns were conditioned with 1 mL methanol and then equilibrated
with two washes of 1 mL 0.1% TFA. The sample was applied to the column, and the
flow through was discarded. The column was washed twice with 1 mL 0.1% TFA. The
sample was eluted in 200 µL 40% acetonitrile, 0.1% TFA. The acetonitrile was removed
by centrifugation in vacuo. Samples were prepared for MALDI-TOF MS analysis by
mixing 1:1 with 110 mM sinapinic acid in 0.1% TFA, 67% acetonitrile. The samples
were allowed to air dry.

The conversion of acrylyl-CoA into 3-hydroxypropionyl-CoA catalyzed by the 3-
hydroxypropionyl-CoA dehydratase was detected using the MALDI-TOF MS technique.
In reaction #1, the control sample exhibited a dominant peak at a molecular weight
corresponding to acrylyl-CoA (MW 823). The reaction #1 sample containing the cell
extract from cells transfected with the 3-hydroxypropionyl-CoA dehydratase-encoding
plasmid exhibited a dominant peak corresponding to 3-hydroxypropionyl-CoA (MW
841). This result demonstrates that the 3-hydroxypropionyl-CoA dehydratase activity
catalyzes reaction #1.

To detect the conversion of 3-hydroxypropionyl-CoA into acrylyl-CoA, reaction
#2 was carried out in 80% D2O. The reaction #2 sample containing the cell extract from
cells transfected with the 3-hydroxypropionyl-CoA dehydratase-encoding plasmid
revealed incorporation of deuterium in the 3-hydroxypropionyl-CoA molecule. This
result indicates that the 3-hydroxypropionyl-CoA dehydratase enzyme catalyzes reaction
#2. In addition, the results from both the #1 and #2 reactions indicate that the 3-
hydroxypropionyl-CoA dehydratase enzyme can catalyze the 3-hydroxypropionyl-CoA ↔
acrylyl-CoA reaction in both directions. It is noted that for both the #1 and #2
reactions, a peak was observed at MW 811, due to leftover acetyl-CoA from the synthesis
of 3-hydroxypropionyl-CoA from 3-hydroxypropionate and acetyl-CoA.

The assays assessing conversion of crotonyl-CoA into 3-hydroxybutyryl-CoA also
were carried out in 80% D2O. In reaction #3, the control sample exhibited a dominant
peak at a molecular weight corresponding to crotonyl-CoA (MW 837). This result
indicated that the crotonyl-CoA was not converted into other products. The reaction #3
sample containing the cell extract from cells transfected with the 3-hydroxypropionyl-
CoA dehydratase-encoding plasmid exhibited a diffuse group of peaks corresponding to
deuterated 3-hydroxybutyryl-CoA (MW 855 to MW 857). This result demonstrates that
the 3-hydroxypropionyl-CoA dehydratase activity catalyzes reaction #3.

A series of control reactions was performed to confirm the specificity of the 3-
hydroxypropionyl-CoA dehydratase. Lactyl-CoA (1 mM) was added to the reaction
mixture containing 100 mM Tris (pH 7.0) both in the presence and the absence of the 3-
hydroxypropionyl-CoA dehydratase. In both cases, the dominant peak observed had a
molecular weight corresponding to lactyl-CoA (MW 841). This result indicates that
lactyl-CoA is not affected by the presence of 3-hydroxypropionyl-CoA dehydratase
activity even in the presence of D2O, meaning that the 3-hydroxypropionyl-CoA
dehydratase enzyme does not attach a hydroxyl group at the alpha carbon position. The
presence of 3-hydroxypropionyl-CoA in an 80% D2O reaction mixture resulted in a shift
upon addition of the 3-hydroxypropionyl-CoA dehydratase activity. In the absence of 3-
hydroxypropionyl-CoA dehydratase activity, a peak corresponding to 3-
hydroxypropionyl-CoA was observed in addition to a peak of MW 811. The MW 811
peak was due to leftover acetyl-CoA from the synthesis of 3-hydroxypropionyl-CoA. In
the presence of 3-hydroxypropionyl-CoA dehydratase activity, a peak corresponding to
deuterated 3-hydroxypropionyl-CoA was observed (MW 842) due to exchange of a
hydroxyl group during the conversion of 3-hydroxypropionyl-CoA to acrylyl-CoA and
vice versa. These control reactions demonstrate that the 3-hydroxypropionyl-CoA
dehydratase enzyme is active on 3-hydroxypropionyl-CoA and not active on lactyl-CoA.
In addition, these results demonstrate that the product of the acrylyl-CoA reaction is 3-
hydroxypropionyl-CoA, not lactyl-CoA.
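
The molecular weights quoted in the MALDI-TOF results are related by simple nominal-mass arithmetic: hydration of the double bond adds one water (18), and each hydrogen replaced by deuterium adds 1. The sketch below reproduces the quoted values under that assumption. It is bookkeeping only and makes no claim about which positions carry the label; `hydrated` is a hypothetical helper name.

```python
WATER = 18           # nominal mass of H2O
DEUTERIUM_SHIFT = 1  # nominal mass gain when one H is replaced by D

# Molecular weights as quoted in the text
MW = {
    "acrylyl-CoA": 823,
    "crotonyl-CoA": 837,
    "3-hydroxypropionyl-CoA": 841,
}


def hydrated(mw, n_deuterium=0):
    """Nominal mass after adding water, with n_deuterium hydrogens replaced by D."""
    return mw + WATER + n_deuterium * DEUTERIUM_SHIFT


print(hydrated(MW["acrylyl-CoA"]))                        # 3-hydroxypropionyl-CoA
print(hydrated(MW["crotonyl-CoA"]))                       # unlabeled 3-hydroxybutyryl-CoA
print(hydrated(MW["crotonyl-CoA"], 2))                    # upper end of the deuterated group
print(MW["3-hydroxypropionyl-CoA"] + DEUTERIUM_SHIFT)     # singly deuterated 3-HP-CoA
```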

Example 4 - Construction of operon #1
The following operon was constructed and can be used to produce 3-HP in E. coli
(Figure 34). Briefly, the operon was cloned into a pET-11a expression vector under the
control of a T7 promoter (Novagen, Madison, WI). The pET-11a expression vector is a
5677 bp plasmid that uses the ATG sequence of an NdeI restriction site as a start codon
for inserted downstream sequences.

Nucleic acid molecules encoding a CoA transferase and a lactyl-CoA dehydratase
were amplified from Megasphaera elsdenii genomic DNA by PCR. Two primers were
used to amplify the CoA transferase-encoding sequence (OSNBpctF
5'-GGGAATTCCATATGAGAAAAGTAGAAATCATTACAGCTG-3', SEQ ID NO:108 and OSCTE-2
5'-GAGAGTATACACAGTTTTCACCTCCTTTACAGCAGAGAT-3', SEQ ID
NO:109), and two primers were used to amplify the lactyl-CoA dehydratase-encoding
sequence (OSCTE-1 5'-ATCTCTGCTGTAAAGGAGGTGAAAACTGTGTATACTCTC-3',
SEQ ID NO:110 and OSEBH-2
5'-ACGTTGATCTCCTTGTACATTAGAGGATTTCCGAGAAAGC-3', SEQ ID NO:111). A nucleic
acid molecule encoding a 3-hydroxypropionyl-CoA dehydratase was amplified from
Chloroflexus aurantiacus genomic DNA by PCR using two primers (OSEBH-1
5'-GCTTTCTGGGAAATCCTCTAATGTACAAGGAGATCAACGT-3', SEQ ID NO:112 and OSHBR
5'-CGACGGATCCTCAACGACCACTGAAGTTGG-3', SEQ ID NO:113).

PCR was conducted in a Perkin Elmer 2400 Thermocycler using 100 ng of
genomic DNA and a mix of rTth polymerase (Applied Biosystems; Foster City, CA) and
Pfu Turbo polymerase (Stratagene; La Jolla, CA) in an 8:1 ratio. The polymerase mix
ensured higher fidelity of the PCR reaction. The following PCR conditions were used:
initial denaturation step of 94°C for 2 minutes; 20 cycles of 94°C for 30 seconds, 54°C
for 30 seconds, and 68°C for 2 minutes; and a final extension at 68°C for 5 minutes. The
obtained PCR products were gel purified using a Qiagen Gel Extraction Kit (Qiagen, Inc.;
Valencia, CA).

The CoA transferase, lactyl-CoA dehydratase (E1, E2 α subunit, and E2 β
subunit), and 3-hydroxypropionyl-CoA dehydratase PCR products were assembled using
PCR. The OSCTE-1 and OSCTE-2 primers as well as the OSEBH-1 and OSEBH-2
primers were complementary to each other. Thus, the complementary DNA ends could
anneal to each other during the PCR reaction, extending the DNA in both directions. To
ensure the efficiency of the assembly, two end primers (OSNBpctF and OSHBR) were
added to the assembly PCR mixture, which contained 100 ng of each PCR product (i.e.,
the PCR products from the CoA transferase, lactyl-CoA dehydratase, and 3-
hydroxypropionyl-CoA dehydratase reactions) as well as the rTth polymerase/Pfu Turbo
polymerase mix described above. The following PCR conditions were used to assemble
the products: 94°C for 1 minute; 25 cycles of 94°C for 30 seconds, 55°C for 30 seconds,
and 68°C for 6 minutes; and a final extension at 68°C for 7 minutes. The assembled PCR
product was gel purified and digested with restriction enzymes (NdeI and BamHI). The
sites for these restriction enzymes were introduced into the assembled PCR product using
the OSNBpctF (NdeI) and OSHBR (BamHI) primers. The digested PCR product was
heated at 80°C for 30 minutes to inactivate the restriction enzymes and used directly for
ligation into the pET-11a vector.
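
The restriction sites carried by the two end primers can be located by a plain substring search. The sketch below uses the standard recognition sequences for NdeI (CATATG) and BamHI (GGATCC) and the primer sequences as transcribed in this example; `sites_in` is an illustrative helper, and positions are 0-based.

```python
SITES = {"NdeI": "CATATG", "BamHI": "GGATCC"}  # standard recognition sequences

OSNBpctF = "GGGAATTCCATATGAGAAAAGTAGAAATCATTACAGCTG"  # SEQ ID NO:108 as given in the text
OSHBR = "CGACGGATCCTCAACGACCACTGAAGTTGG"              # SEQ ID NO:113 as given in the text


def sites_in(seq):
    """Map each enzyme whose site occurs in seq to the 0-based start of the first match."""
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


forward_sites = sites_in(OSNBpctF)
reverse_sites = sites_in(OSHBR)
print(forward_sites, reverse_sites)

# The ATG inside the NdeI site serves as the start codon of the first coding sequence
nde_start = forward_sites["NdeI"]
start_codon = OSNBpctF[nde_start + 3:nde_start + 6]
print(start_codon)
```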

The pET-11a vector was digested with NdeI and BamHI restriction enzymes, gel
purified using a Qiagen Gel Extraction kit, treated with shrimp alkaline phosphatase
(Roche Molecular Biochemicals; Indianapolis, IN) and used in a ligation reaction with the
assembled PCR product. The ligation was performed at 16°C overnight using T4 ligase
(Roche Molecular Biochemicals; Indianapolis, IN). The resulting ligation reaction was
transformed into NovaBlue chemically competent cells (Novagen; Madison, WI) using a
heat-shock method. Once heat shocked, the cells were plated on LB plates supplemented
with 50 µg/mL carbenicillin. The plasmid DNA was purified from individual colonies
using a QiaPrep Spin Miniprep Kit (Qiagen Inc., Valencia, CA) and analyzed by
digestion with NdeI and BamHI restriction enzymes.

Example 5 - Construction of operon #2
The following operon was constructed and can be used to produce 3-HP in E. coli
(Figure 35A and B). Nucleic acid molecules encoding a CoA transferase and a lactyl-
CoA dehydratase were amplified from Megasphaera elsdenii genomic DNA by PCR.
Two primers were used to amplify the CoA transferase-encoding sequence (OSNBpctF
and OSCTE-2), and two primers were used to amplify the lactyl-CoA dehydratase-
encoding sequence (OSCTE-1 and OSNBelR
5'-CGACGGATCCTTAGAGGATTTCCGAGAAAGC-3', SEQ ID NO:114). A nucleic acid
molecule encoding a 3-hydroxypropionyl-CoA dehydratase was amplified from
Chloroflexus aurantiacus genomic DNA by PCR using two primers (OSXNhF
5'-GGTGTCTAGAGACAGTCCTGTCGTTTATGTAGAAGGAG-3', SEQ ID NO:115 and OSXNhR
5'-GGGAATTCCATATGCGTAACTTCCTCCTG...GTTGG-3', SEQ ID NO:116).

PCR was conducted in a Perkin Elmer 2400 Thermocycler using 100 ng of
genomic DNA and a mix of rTth polymerase (Applied Biosystems; Foster City, CA) and
Pfu Turbo polymerase (Stratagene; La Jolla, CA) in an 8:1 ratio. The polymerase mix
ensured higher fidelity of the PCR reaction. The following PCR conditions were used:
initial denaturation step of 94°C for 2 minutes; 20 cycles of 94°C for 30 seconds, 54°C
for 30 seconds, and 68°C for 2 minutes; and a final extension at 68°C for 5 minutes. The
obtained PCR products were gel purified using a Qiagen Gel Extraction Kit (Qiagen, Inc.;
Valencia, CA).

The CoA transferase and lactyl-CoA dehydratase (E1, E2 α subunit, and E2 β
subunit) PCR products were assembled using PCR. The OSCTE-1 and OSCTE-2 primers
were complementary to each other. Thus, the 22 nucleotides at the end of the CoA
transferase sequence and the 22 nucleotides at the beginning of the lactyl-CoA
dehydratase sequence could anneal to each other during the PCR reaction, extending the
DNA in both directions. To ensure the efficiency of the assembly, two end primers
(OSNBpctF and OSNBelR) were added to the assembly PCR mixture, which contained
100 ng of the CoA transferase PCR product, 100 ng of the lactyl-CoA dehydratase PCR
product, and the rTth polymerase/Pfu Turbo polymerase mix described above. The
following PCR conditions were used to assemble the products: 94°C for 1 minute; 20
cycles of 94°C for 30 seconds, 54°C for 30 seconds, and 68°C for 5 minutes; and a final
extension at 68°C for 6 minutes.
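
The outcome of such an assembly PCR, two products joined through a shared overlap, can be modeled on strings. The sketch below is a simplified in silico illustration and not the laboratory procedure. The two fragments are made-up toy sequences that share a 22-nucleotide junction, and `join_by_overlap` is a hypothetical helper.

```python
def join_by_overlap(upstream, downstream, min_overlap=22):
    """Merge two sequences whose end and start share an identical overlap."""
    max_len = min(len(upstream), len(downstream))
    # Try the longest possible overlap first, down to the minimum length
    for size in range(max_len, min_overlap - 1, -1):
        if upstream.endswith(downstream[:size]):
            return upstream + downstream[size:]
    raise ValueError("no overlap of the required length")


# Toy fragments (hypothetical) sharing a 22-nucleotide junction
junction = "ATCTCTGCTGTAAAGGAGGTGA"
fragment_a = "GGGTTTAAACCC" + junction
fragment_b = junction + "TTTGGGCCCAAA"

assembled = join_by_overlap(fragment_a, fragment_b)
print(len(assembled))
```

The junction appears once in the merged sequence, which is why the assembled product is shorter than the sum of its parts by the overlap length.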

The assembled PCR product was gel purified and digested with restriction
enzymes (NdeI and BamHI). The sites for these restriction enzymes were introduced into
the assembled PCR product using the OSNBpctF (NdeI) and OSNBelR (BamHI) primers.
The digested PCR product was heated at 80°C for 30 minutes to inactivate the restriction
enzymes and used directly for ligation into a pET-11a vector.

The pET-11a vector was digested with NdeI and BamHI restriction enzymes, gel
purified using a Qiagen Gel Extraction kit, treated with shrimp alkaline phosphatase
(Roche Molecular Biochemicals; Indianapolis, IN) and used in a ligation reaction with the
assembled PCR product. The ligation was performed at 16°C overnight using T4 ligase
(Roche Molecular Biochemicals; Indianapolis, IN). The resulting ligation reaction was
transformed into NovaBlue chemically competent cells (Novagen; Madison, WI) using a
heat-shock method. Once heat shocked, the cells were plated on LB plates supplemented
with 50 µg/mL carbenicillin. The plasmid DNA was purified from individual colonies
using a QiaPrep Spin Miniprep Kit (Qiagen Inc., Valencia, CA) and analyzed by
digestion with NdeI and BamHI restriction enzymes. The digest revealed that the DNA
fragment containing CoA transferase-encoding and lactyl-CoA dehydratase-encoding
sequences was cloned into the pET-11a vector.
The plasmid carrying the CoA transferase-encoding and lactyl-CoA dehydratase-
encoding sequences (pTD) was digested with XbaI and NdeI restriction enzymes, gel
purified, and used for cloning the 3-hydroxypropionyl-CoA dehydratase-encoding
product upstream of the CoA transferase-encoding sequence. Since this XbaI and NdeI
digest eliminated a ribosome-binding site (RBS) from the pET-11a vector, a new
homologous RBS was cloned into the plasmid together with the 3-hydroxypropionyl-CoA
dehydratase-encoding product. Briefly, the 3-hydroxypropionyl-CoA dehydratase-
encoding PCR product was digested with XbaI and NdeI restriction enzymes, heated at
65°C for 30 minutes to inactivate the restriction enzymes, and ligated into pTD. The
ligation mixture was transformed into chemically competent NovaBlue cells (Novagen)
that then were plated on LB plates supplemented with 50 µg/mL carbenicillin.

Individual colonies were selected, and the plasmid DNA was obtained using a Qiagen
Spin Miniprep Kit. The obtained plasmids were digested with XbaI and NdeI restriction
enzymes and analyzed by gel electrophoresis. pTD plasmids containing the inserted 3-
hydroxypropionyl-CoA dehydratase-encoding product were designated pHTD. While
expression of the lactyl-CoA hydratase, CoA transferase, and 3-hydroxypropionyl-CoA
dehydratase sequences from pHTD was directed by a single T7 promoter, each coding
sequence had an individual RBS upstream.

To ensure the correct assembly and cloning of the lactyl-CoA hydratase, CoA
transferase, and 3-hydroxypropionyl-CoA dehydratase sequences into one operon, both
ends of the operon and all junctions between the coding sequences were sequenced. This
DNA analysis revealed that the operon was assembled correctly.

The pHTD plasmid was transformed into BL21(DE3) cells to study the expression
of the encoded sequences.

Example 6 - Construction of operons #3 and #4
Operon #3 (Figure 36A and B) and operon #4 (Figure 37A and B) each position
the E1 activator at the end of the operon. Operon #3 contains an RBS between the 3-
hydroxypropionyl-CoA dehydratase-encoding sequence and the E1 activator-encoding
sequence. In operon #4, however, the stop codon of the 3-hydroxypropionyl-CoA
dehydratase-encoding sequence is fused with the start codon of the E1 activator-encoding
sequence as follows: TAGTG. The absence of the RBS in operon #4 can decrease the
level of E1 activator expression.
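
One way to read the TAGTG junction is as a TAG stop codon and a GTG start codon sharing their middle G. The sketch below searches a junction string for that kind of overlap. It assumes the usual bacterial stop codons (TAA, TAG, TGA) and start codons (ATG, GTG, TTG), and `stop_start_overlaps` is an illustrative name.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODONS = {"ATG", "GTG", "TTG"}  # ATG plus the common alternative bacterial starts


def stop_start_overlaps(junction):
    """List (stop, start, shared_bases) where a start codon begins inside a stop codon."""
    hits = []
    for i in range(len(junction) - 2):
        stop = junction[i:i + 3]
        if stop not in STOP_CODONS:
            continue
        # A start codon beginning 1 or 2 bases into the stop shares 2 or 1 bases with it
        for offset in (1, 2):
            start = junction[i + offset:i + offset + 3]
            if start in START_CODONS:
                hits.append((stop, start, 3 - offset))
    return hits


overlaps = stop_start_overlaps("TAGTG")
print(overlaps)
```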

To construct operon #3, nucleic acid molecules encoding a CoA transferase and a
lactyl-CoA dehydratase were amplified from Megasphaera elsdenii genomic DNA by
PCR. Two primers were used to amplify the CoA transferase-encoding sequence
(OSNBpctF and OSHTR 5'-ACGTTGATCTCCTTCTACATTATTTTTTCAGTCCCATG-3',
SEQ ID NO:117), two primers were used to amplify the E2 α and β
subunits of the lactyl-CoA dehydratase-encoding sequence (OSEHXNF
5'-GGTGTCTAGAGTCAAAGGAGAGAACAAAATCATGAGTG-3', SEQ ID NO:118
and OSEHXNR 5'-GGGAATTCCATATGCGTAACTTCCTCCTGCTATTAGAGGATTTCCGAGAAAGC-3',
SEQ ID NO:119), and two primers were used to amplify the E1
activator of the lactyl-CoA dehydratase-encoding sequence (OSHrEIF
5'-TCAGTGGTCGTTGATCACGCTATAAAGAAAGGTGAAAACTGTGTATACTCTC-3', SEQ
ID NO:120 and OSEIBR 5'-CGACGGATCCCTTCCTTGGAGCT,
SEQ ID NO:121). A nucleic acid molecule encoding a 3-hydroxypropionyl-CoA
dehydratase was amplified from Chloroflexus aurantiacus genomic DNA by PCR
using two primers (OSTHF 5'-CATGGGACTGAAAAAATAATGTAGAAGGAGATCAACGT-3',
SEQ ID NO:122 and OSEIrHR
5'-GAGAGTATACACAGTTTTCACCTTTCTTTATAGCGTGATCAACGACCACTGA-3', SEQ ID NO:123).

PCR was conducted in a Perkin Elmer 2400 Thermocycler using 100 ng of
genomic DNA and a mix of rTth polymerase (Applied Biosystems; Foster City, CA) and
Pfu Turbo polymerase (Stratagene; La Jolla, CA) in an 8:1 ratio. The polymerase mix
ensured higher fidelity of the PCR reaction. The following PCR conditions were used:
initial denaturation step of 94°C for 2 minutes; 20 cycles of 94°C for 30 seconds, 54°C
for 30 seconds, and 68°C for 2 minutes; and a final extension at 68°C for 5 minutes. The
obtained PCR products were gel purified using a Qiagen Gel Extraction Kit (Qiagen, Inc.;
Valencia, CA).

The 3-hydroxypropionyl-CoA dehydratase and E1 activator PCR products were
assembled using PCR. The OSHrEIF and OSEIrHR primers were complementary to
each other. Thus, the primers could anneal to each other during the PCR reaction,
extending the DNA in both directions. To ensure the efficiency of the assembly, two end
primers (OSTHF and OSEIBR) were added to the assembly PCR mixture, which
contained 100 ng of the 3-hydroxypropionyl-CoA dehydratase PCR product, 100 ng of
the E1 activator PCR product, and the rTth polymerase/Pfu Turbo polymerase mix
described above. The following PCR conditions were used to assemble the products:
94°C for 1 minute; 20 cycles of 94°C for 30 seconds, 54°C for 30 seconds, and 68°C for
1.5 minutes; and a final extension at 68°C for 5 minutes.

The assembled PCR product was gel purified and used in a second assembly PCR
with the gel-purified CoA transferase PCR product. The OSTHF and OSHTR primers
were complementary to each other. Thus, the complementary DNA ends could anneal to
each other during the PCR reaction, extending the DNA in both directions. To ensure the
efficiency of the assembly, two end primers (OSNBpctF and OSEIBR) were added to the
second assembly PCR mixture, which contained 100 ng of the purified 3-
hydroxypropionyl-CoA dehydratase/E1 PCR assembly, 100 ng of the purified CoA
transferase PCR product, and the polymerase mix described above. The following PCR
conditions were used to assemble the products: 94°C for 1 minute; 20 cycles of 94°C for
30 seconds, 54°C for 30 seconds, and 68°C for 3 minutes; and a final extension at 68°C
for 5 minutes.

The assembled PCR product was gel purified and digested with NdeI and BamHI
restriction enzymes. The sites for these restriction enzymes were introduced into the
assembled PCR products with the OSNBpctF (NdeI) and OSEIBR (BamHI) primers. The
digested PCR product was heated at 80°C for 30 minutes to inactivate the restriction
enzymes and used directly for ligation into a pET-11a vector.

The pET-11a vector was digested with NdeI and BamHI restriction enzymes, gel
purified using a Qiagen Gel Extraction kit, treated with shrimp alkaline phosphatase
(Roche Molecular Biochemicals; Indianapolis, IN) and used in a ligation reaction with the
assembled PCR product. The ligation was performed at 16°C overnight using T4 ligase
(Roche Molecular Biochemicals; Indianapolis, IN). The resulting ligation reaction was
transformed into NovaBlue chemically competent cells (Novagen; Madison, WI) using a
heat-shock method. Once heat shocked, the cells were plated on LB plates supplemented
with 50 µg/mL carbenicillin. The plasmid DNA was purified from individual colonies
using a QiaPrep Spin Miniprep Kit (Qiagen Inc.; Valencia, CA). The resulting plasmids
carrying the CoA transferase, 3-hydroxypropionyl-CoA dehydratase, and E1 activator
sequences (pTHrEI) were digested with XbaI and NdeI, purified using gel electrophoresis
and a Qiagen Gel Extraction kit, and used as a vector for cloning of the E2 α subunit/E2 β
subunit PCR product.

The E2 α subunit/E2 β subunit PCR product was digested with the same enzymes
and ligated into the pTHrEI vector. The ligation reaction was performed at 16°C
overnight using T4 ligase (Roche Molecular Biochemicals; Indianapolis, IN). The
ligation mixture was transformed into chemically competent NovaBlue cells (Novagen)
that then were plated on LB plates supplemented with 50 µg/mL carbenicillin. The
plasmid DNA was purified from individual colonies using a QiaPrep Spin Miniprep Kit
(Qiagen Inc., Valencia, CA) and digested with XbaI and NdeI restriction enzymes for gel
electrophoresis analysis. The resulting plasmids carrying the constructed operon #3 
(pEm&EI) were transformed into BL21(DE3) cells to study the expression of the
cloned sequences. Electrospray mass spectrometry assay confirmed that extracts from
these cells have CoA transferase activity and 3-hydroxypropionyl-CoA dehydratase
activity. Similar assays are used to confirm that extracts from these cells also have lactyl-
CoA dehydratase activity.

To construct operon #4, nucleic acid molecules encoding a CoA transferase and a
lactyl-CoA dehydratase were amplified from Megasphaera elsdenii genomic DNA by
PCR. Two primers were used to amplify the CoA transferase-encoding sequence
(OSNBpctF and OSHTR), two primers were used to amplify the E2 α and β subunits of
the lactyl-CoA dehydratase-encoding sequence (OSEHXNF and OSEHXNR), and two
primers were used to amplify the E1 activator of the lactyl-CoA dehydratase-encoding
sequence (OSHEIF 5'-CAACTTCAGTGGTCGTTAGTGAAAACTGTGTATACTCTC-3',
SEQ ID NO:124 and OSEIBR). A nucleic acid molecule encoding a 3-
hydroxypropionyl-CoA dehydratase was amplified from Chloroflexus aurantiacus
genomic DNA by PCR using two primers (OSTHF and OSEIHR
5'-GAGACTATACACAGTTTTCACTAACGACCACTGAAGTTGG-3', SEQ ID
NO:125).

PCR was conducted in a Perkin Elmer 2400 Thermocycler using 100 ng of
genomic DNA and a mix of rTth polymerase (Applied Biosystems; Foster City, CA) and
Pfu Turbo polymerase (Stratagene; La Jolla, CA) in an 8:1 ratio. The polymerase mix
ensured higher fidelity of the PCR reaction. The following PCR conditions were used:
initial denaturation step of 94°C for 2 minutes; 20 cycles of 94°C for 30 seconds, 54°C
for 30 seconds, and 68°C for 2 minutes; and a final extension at 68°C for 5 minutes. The
obtained PCR products were gel purified using a Qiagen Gel Extraction Kit (Qiagen, Inc.;
Valencia, CA).

The 3-hydroxypropionyl-CoA dehydratase and E1 activator PCR products were
assembled using PCR. The OSHEIF and OSEIHR primers were complementary to each
other. Thus, the complementary DNA ends introduced by these primers could anneal to
each other during the PCR reaction, extending the DNA in both directions. To ensure the
efficiency of the assembly, two end primers (OSTHF and OSEIBR) were added to the
assembly PCR mixture, which contained 100 ng of the 3-hydroxypropionyl-CoA
dehydratase PCR product, 100 ng of the E1 activator PCR product, and the rTth
polymerase/Pfu Turbo polymerase mix described above. The following PCR conditions
were used to assemble the products: 94°C for 1 minute; 20 cycles of 94°C for 30 seconds,
54°C for 30 seconds, and 68°C for 1.5 minutes; and a final extension at 68°C for 5
minutes.
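The logic of this assembly step, in which two products that share a complementary end are joined into one longer product, can be sketched in a few lines. This is a simplified illustration on the sense strand only: the function name join_by_overlap and both example sequences are hypothetical and are not taken from the constructs described here.

```python
def join_by_overlap(upstream: str, downstream: str, min_overlap: int = 4) -> str:
    """Join two fragments that share an identical end.

    Looks for the longest suffix of `upstream` that equals a prefix of
    `downstream` and merges the two fragments across that shared region,
    mimicking what annealing plus extension achieves in assembly PCR.
    """
    longest = min(len(upstream), len(downstream))
    for k in range(longest, min_overlap - 1, -1):
        if upstream[-k:] == downstream[:k]:
            return upstream + downstream[k:]
    raise ValueError("fragments do not share an end of the required length")


# Hypothetical toy fragments; the shared end is TTAGC.
frag_a = "ATGCCGTTAGC"
frag_b = "TTAGCGGATAA"
assembled = join_by_overlap(frag_a, frag_b)
print(assembled)
```

In the actual reaction the shared region is supplied by the complementary primers, and the two end primers then amplify the full-length assembled product.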

The assembled PCR product was gel purified and used in a second assembly PCR
with the gel-purified CoA transferase PCR product. The OSTHF and OSHTR primers
were complementary to each other. Thus, the complementary DNA ends could anneal to
each other during the PCR reaction, extending the DNA in both directions. To ensure the
efficiency of the assembly, two end primers (OSNBpctF and OSEIBR) were added to the
second assembly PCR mixture, which contained 100 ng of the purified
3-hydroxypropionyl-CoA dehydratase/E1 PCR assembly, 100 ng of the purified CoA
transferase PCR product, and the polymerase mix described above. The following PCR
conditions were used to assemble the products: 94°C for 1 minute; 20 cycles of 94°C for
30 seconds, 54°C for 30 seconds, and 68°C for 3 minutes; and a final extension at 68°C
for 5 minutes.


The assembled PCR product was gel purified and digested with NdeI and BamHI
restriction enzymes. The sites for these restriction enzymes were introduced into the
assembled PCR products with the OSNBpctF (NdeI) and OSEIBR (BamHI) primers. The
digested PCR product was heated at 80°C for 30 minutes to inactivate the restriction
enzymes and used directly for ligation into a pET-11a vector.

The pET-11a vector was digested with NdeI and BamHI restriction enzymes, gel
purified using a Qiagen Gel Extraction kit, treated with shrimp alkaline phosphatase
(Roche Molecular Biochemicals; Indianapolis, IN), and used in a ligation reaction with the
assembled PCR product. The ligation was performed at 16°C overnight using T4 ligase
(Roche Molecular Biochemicals; Indianapolis, IN). The resulting ligation reaction was
transformed into NovaBlue chemically competent cells (Novagen; Madison, WI) using a
heat-shock method. Once shocked, the cells were plated on LB plates supplemented with
50 µg/mL carbenicillin. The plasmid DNA was purified from individual colonies using a
QiaPrep Spin Miniprep Kit (Qiagen Inc., Valencia, CA). The resulting plasmids carrying
the CoA transferase, 3-hydroxypropionyl-CoA dehydratase, and E1 activator sequences
(pTHEI) were digested with XbaI and NdeI, purified using gel electrophoresis and a
Qiagen Gel Extraction kit, and used as a vector for cloning of the E2 α subunit/E2 β
subunit PCR product.

The E2 α subunit/E2 β subunit PCR product was digested with the same enzymes
and ligated into the pTHEI vector. The ligation reaction was performed at 16°C
overnight using T4 ligase (Roche Molecular Biochemicals, Indianapolis, IN). The
ligation mixture was transformed into chemically competent NovaBlue cells (Novagen)
that then were plated on LB plates supplemented with 50 µg/mL carbenicillin. The
plasmid DNA was purified from individual colonies using a QiaPrep Spin Miniprep Kit
(Qiagen Inc., Valencia, CA) and digested with XbaI and NdeI restriction enzymes for gel
electrophoresis analysis. The resulting plasmids carrying the constructed operon #4
(pEIITHEI) were transformed into BL21(DE3) cells to study the expression of the cloned
sequences. Electrospray mass spectrometry assays confirmed that extracts from these
cells have CoA transferase activity and 3-hydroxypropionyl-CoA dehydratase activity.
Similar assays are used to confirm that extracts from these cells also have lactyl-CoA
dehydratase activity.

E. coli plasmid pEIITHrEI carrying a synthetic 3-HP operon was digested with
NruI, XbaI, and BamHI restriction enzymes. The XbaI-BamHI DNA fragment was gel
purified with a Qiagen Gel Extraction Kit (Qiagen, Inc., Valencia, CA) and used for
further cloning into the Bacillus vector pWH1520 (MoBiTec GmbH, Göttingen, Germany).
Vector pWH1520 was digested with SpeI and BamHI restriction enzymes and gel purified
with a Qiagen Gel Extraction Kit. The XbaI-BamHI fragment carrying the 3-HP operon
was ligated into the pWH1520 vector at 16°C overnight using T4 ligase. The ligation
mixture was transformed into chemically competent TOP10 cells and plated on LB plates
supplemented with 50 µg/mL carbenicillin. One clone, named B. megaterium (pBP026),
was used for assays of CoA transferase and CoA hydratase activities. The assays were
performed as described above for E. coli. The enzymatic activities were 5 U/mg and 13
U/mg, respectively.

Example 7 - Construction of a two plasmid system
The following constructs were made and can be used to produce 3-HP in E. coli
(Figure 38A and B). Nucleic acid molecules encoding a CoA transferase and a lactyl-CoA
dehydratase were amplified from Megasphaera elsdenii genomic DNA by PCR.
Two primers were used to amplify the CoA transferase-encoding sequence (OSNBpctF
and OSHTR), two primers were used to amplify the E2 α and β subunits of the
lactyl-CoA dehydratase-encoding sequence (OSEIIXNF and OSEIIXNR), and two primers were
used to amplify the E1 activator of the lactyl-CoA dehydratase-encoding sequence
(E1PROF 5'-GTCGCAGAATTCCCATCAATCGCAGCAATCCCAAC-3', SEQ ID
NO:126 and E1PROR 5'-TAACATGGTACCGACAGAAGCGGACCAGCAAACGA-3',
SEQ ID NO:127). A nucleic acid molecule encoding a 3-hydroxypropionyl-CoA
dehydratase was amplified from Chloroflexus aurantiacus genomic DNA by PCR
using two primers (OSTHF and OSHBR 5'-CGACGGATCCTCAACGACCACTGAAGTTGG-3',
SEQ ID NO:128).
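The cloning steps that follow rely on restriction sites carried by these primers (EcoRI and KpnI for the E1 activator product, BamHI for the OSHBR end). As a quick illustrative check, the sketch below scans the three primer sequences, as transcribed from this example, for the standard recognition sequences of the enzymes used. The dictionary and function names are hypothetical; because all five recognition sequences are palindromic, scanning one strand is sufficient.

```python
# Standard recognition sequences of the enzymes named in this example.
SITES = {
    "NdeI": "CATATG",
    "BamHI": "GGATCC",
    "EcoRI": "GAATTC",
    "KpnI": "GGTACC",
    "XbaI": "TCTAGA",
}

# Primer sequences as transcribed from the text (5' to 3').
PRIMERS = {
    "E1PROF": "GTCGCAGAATTCCCATCAATCGCAGCAATCCCAAC",
    "E1PROR": "TAACATGGTACCGACAGAAGCGGACCAGCAAACGA",
    "OSHBR": "CGACGGATCCTCAACGACCACTGAAGTTGG",
}


def sites_in(seq: str) -> list:
    """Return the names of all listed enzymes whose site occurs in seq."""
    seq = seq.upper()
    return sorted(name for name, site in SITES.items() if site in seq)


for name, seq in PRIMERS.items():
    print(name, sites_in(seq))
```

The result agrees with the digests described below: the E1 activator product is cut with EcoRI and KpnI, and the OSHBR end of the assembled product is cut with BamHI.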

PCR was conducted in a Perkin Elmer 2400 Thermocycler using 100 ng of
genomic DNA and a mix of rTth polymerase (Applied Biosystems; Foster City, CA) and
Pfu Turbo polymerase (Stratagene; La Jolla, CA) in an 8:1 ratio. The polymerase mix
ensured higher fidelity of the PCR reaction. The following PCR conditions were used:
initial denaturation step of 94°C for 2 minutes; 20 cycles of 94°C for 30 seconds, 54°C
for 30 seconds, and 68°C for 2 minutes; and a final extension at 68°C for 5 minutes. The
obtained PCR products were gel purified using a Qiagen Gel Extraction Kit (Qiagen, Inc.;
Valencia, CA).

The CoA transferase PCR product and the 3-hydroxypropionyl-CoA dehydratase
PCR product were assembled using PCR. The OSTHF and OSHTR primers were
complementary to each other. Thus, the complementary DNA ends could anneal to each
other during the PCR reaction, extending the DNA in both directions. To ensure the
efficiency of the assembly, two end primers (OSNBpctF and OSHBR) were added to the
assembly PCR mixture, which contained 100 ng of the purified CoA transferase PCR
product, 100 ng of the purified 3-hydroxypropionyl-CoA dehydratase PCR product, and
the polymerase mix described above. The following PCR conditions were used to
assemble the products: 94°C for 1 minute; 20 cycles of 94°C for 30 seconds, 54°C for 30
seconds, and 68°C for 2.5 minutes; and a final extension at 68°C for 5 minutes.

The assembled PCR product was gel purified and digested with NdeI and BamHI
restriction enzymes. The sites for these restriction enzymes were introduced into the
assembled PCR products with the OSNBpctF (NdeI) and OSHBR (BamHI) primers. The
digested PCR product was heated at 80°C for 30 minutes to inactivate the restriction
enzymes and used directly for ligation into a pET-11a vector.

The pET-11a vector was digested with NdeI and BamHI restriction enzymes, gel
purified using a Qiagen Gel Extraction kit, treated with shrimp alkaline phosphatase
(Roche Molecular Biochemicals; Indianapolis, IN), and used in a ligation reaction with the
assembled PCR product. The ligation was performed at 16°C overnight using T4 ligase
(Roche Molecular Biochemicals; Indianapolis, IN). The resulting ligation reaction was
transformed into NovaBlue chemically competent cells (Novagen; Madison, WI) using a
heat-shock method. Once shocked, the cells were plated on LB plates supplemented with
50 µg/mL carbenicillin. The plasmid DNA was purified from individual colonies using a
QiaPrep Spin Miniprep Kit (Qiagen Inc.; Valencia, CA) and digested with NdeI and
BamHI restriction enzymes for gel electrophoresis analysis. The resulting plasmids
carrying the CoA transferase and 3-hydroxypropionyl-CoA dehydratase (pTH) were
digested with XbaI and NdeI, purified using gel electrophoresis and a Qiagen Gel
Extraction kit, and used as a vector for cloning of the E2 α subunit/E2 β subunit PCR
product.

The E2 α subunit/E2 β subunit PCR product digested with the same enzymes was
ligated into the pTH vector. The ligation reaction was performed at 16°C overnight using
T4 ligase (Roche Molecular Biochemicals; Indianapolis, IN). The ligation mixture was
transformed into chemically competent NovaBlue cells (Novagen) that then were plated
on LB plates supplemented with 50 µg/mL carbenicillin. The plasmid DNA was purified
from individual colonies using a QiaPrep Spin Miniprep Kit (Qiagen Inc.; Valencia, CA)
and digested with XbaI and NdeI restriction enzymes for gel electrophoresis analysis.
The resulting plasmids carrying the E2 α and β subunits of the lactyl-CoA dehydratase,
the CoA transferase, and the 3-hydroxypropionyl-CoA dehydratase (pEIITH) were
transformed into BL21(DE3) cells to study the expression of the cloned sequences.

The gel purified E1 activator PCR product was digested with EcoRI and KpnI
restriction enzymes, heated at 65°C for 30 minutes, and ligated into a vector
(pPROLar.A) that was digested with EcoRI and KpnI restriction enzymes, gel purified
using a Qiagen Gel Extraction kit, and treated with shrimp alkaline phosphatase (Roche
Molecular Biochemicals; Indianapolis, IN). The ligation was performed at 16°C
overnight using T4 ligase (Roche Molecular Biochemicals; Indianapolis, IN). The
resulting ligation reaction was transformed into DH10B electro-competent cells (Gibco
Life Technologies; Gaithersburg, MD) using electroporation. Once electroporated, the
cells were plated on LB plates supplemented with 25 µg/mL kanamycin. The plasmid
DNA was purified from individual colonies using a QiaPrep Spin Miniprep Kit (Qiagen
Inc., Valencia, CA) and digested with EcoRI and KpnI restriction enzymes for gel
electrophoresis analysis. The resulting plasmids carrying the E1 activator (pPROEI) are
transformed into BL21(DE3) cells to study the expression of the cloned sequence.

The pPROEI and pEIITH plasmids are compatible plasmids that can be used in
the same bacterial host cell. In addition, the expression from the pPROEI and pEIITH
plasmids can be induced at different levels using IPTG and arabinose, allowing for the
fine-tuning of the expression of the cloned sequences.

Example 8 - Production of 3-HP
3-HP was produced using recombinant E. coli in a small-scale batch fermentation
reaction. The construction of strain ALS848 (also designated as TA3476 (J. Bacteriol.,
143:1081-1085 (1980))) that carried inducible T7 RNA polymerase was performed using
a λDE3 lysogenization kit (Novagen, Madison, WI) according to the manufacturer's
instructions. The constructed strain was designated ALS484(DE3). Strain ALS484(DE3)
was transformed with the pEIITHrEI plasmid using standard electroporation techniques.
The transformants were selected on LB/carbenicillin (50 µg/mL) plates. A single colony
was used to inoculate a 4 mL culture in a 15 mL culture tube. Strain ALS484(DE3)
carrying vector pET-11a was used as a control. The cells were grown at 37°C and 250
rpm in an Innova 4230 Incubator Shaker (New Brunswick Scientific, Edison, NJ) for
eight to nine hours. This culture (3 mL) was used to start an anaerobic fermentation.
Two 100 mL anaerobic cultures of ALS(DE3)/pET-11a and ALS(DE3)/pEIITHrEI were
grown in serum bottles using LB media supplemented with 0.4% glucose, 50 µg/mL
carbenicillin, and 100 mM MOPS. The cultures were grown overnight at 37°C without
shaking. The overnight grown cultures were sub-cultured in serum bottles using fresh LB
media supplemented with 0.4% glucose, 50 µg/mL carbenicillin, and 100 mM MOPS.
The starting OD(600) of these cultures was adjusted to 0.3. These serum bottles were
incubated at 37°C without shaking. After one hour of incubation, the cultures were
induced with 100 µM IPTG. A 3 mL sample was taken from each of the serum bottles at
30 minutes, 1 hour, 2 hours, 4 hours, 6 hours, 8 hours, and 24 hours. Each sample was
divided between two properly labeled 2 mL microcentrifuge tubes, 1.5 mL per tube.
The samples were spun down in a microcentrifuge at 14,000 x g for 3
minutes. The supernatant was passed through a 0.45 µm syringe filter, and the resulting
filtrate was stored at -20°C until further analysis. The formation of fermentation
products, mainly lactate and 3-hydroxypropionate, was measured by detecting derivatized
CoA esters of lactate and 3-HP using LC/MS.

The following methods were performed to convert lactate and 3-HP into their
respective CoA esters. Briefly, the filtrates were mixed with CoA-reaction buffer (200
mM potassium phosphate buffer, 2 mM acetyl-CoA, and 0.1 mg/mL purified transferase)
in a 1:1 ratio. The reaction was allowed to proceed for 20 minutes at room temperature.
The reaction was stopped by adding 1 volume of 10% TFA. The sample was purified
using Sep Pak Vac columns (Waters). The column was conditioned with methanol and
washed two times with 0.1% TFA. The sample was then applied to the column, and the
column was washed two more times with 0.1% TFA. The sample was eluted with 40%
acetonitrile, 0.1% TFA. The acetonitrile was removed from the sample by vacuum
centrifugation. The samples were then analyzed by LC/MS.

Analysis of the standard CoA/CoA thioester mixtures and the CoA thioester
mixtures derived from fermentation broths was carried out using a Waters/Micromass
ZQ LC/MS instrument, which had a Waters 2690 liquid chromatograph with a Waters 996
Photo-Diode Array (PDA) absorbance monitor placed in series between the
chromatograph and the single quadrupole mass spectrometer. LC separations were made
using a 4.6 x 150 mm YMC ODS-AQ (3 µm particles, 120 Å pores) reversed-phase
chromatography column at room temperature. Two gradient elution systems were
developed using different mobile phases for the separation of the CoA esters. These two
systems are summarized in Table 3. Elution system 1 was developed to provide the most
rapid and efficient separation of the five-component CoA/CoA thioester mixture (CoA,
acetyl-CoA, lactyl-CoA, acrylyl-CoA, and propionyl-CoA), while elution system 2 was
developed to provide baseline separation of the structurally isomeric esters lactyl-CoA
and 3HP-CoA in addition to separation of the remaining esters listed above. In all cases,
the flow rate was 0.250 mL/minute, and photodiode array UV absorbance was monitored
from 200 nm to 400 nm. All parameters of the electrospray MS system were optimized
and selected based on generation of protonated molecular ions ([M + H]+) of the analytes
of interest and production of characteristic fragment ions. The following instrumental
parameters were used for ESI-MS detection of CoA and organic acid-CoA thioesters in
the positive ion mode: Capillary: 4.0 V; Cone: 56 V; Extractor: 1 V; RF lens: 0 V; Source
temperature: 100°C; Desolvation temperature: 300°C; Desolvation gas: 500 L/hour; Cone
gas: 40 L/hour; Low mass resolution: 13.0; High mass resolution: 14.5; Ion energy: 0.5;
Multiplier: 650. Uncertainties for reported mass/charge ratios (m/z) and molecular
masses are ± 0.01%. Table 3 provides a summary of gradient elution systems for the
separation of organic acid-Coenzyme A thioesters.

Table 3

System   Buffer A                   Buffer B            Gradient
                                                        Time    Percent B
1        25 mM ammonium acetate,    ACN,                0       10
         0.5% acetic acid           0.5% acetic acid    40      40
                                                        42      100
                                                        47      100
                                                        50      10
2        25 mM ammonium acetate,    ACN,                0       10
         10 mM TEA,                 0.5% acetic acid    10      10
         0.5% acetic acid                               45      60
                                                        50      100
                                                        53      100
                                                        54      10
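The gradient programs in Table 3 are given as time/percent B breakpoints. The sketch below shows one way to read off the mobile-phase composition at an arbitrary time. It rests on two assumptions that the table itself does not state: that the pump ramps linearly between listed breakpoints and that the times are in minutes. The names GRADIENTS and percent_b are hypothetical.

```python
from bisect import bisect_right

# (time, percent B) breakpoints for elution systems 1 and 2 of Table 3.
GRADIENTS = {
    1: [(0, 10), (40, 40), (42, 100), (47, 100), (50, 10)],
    2: [(0, 10), (10, 10), (45, 60), (50, 100), (53, 100), (54, 10)],
}


def percent_b(system: int, t: float) -> float:
    """Percent buffer B at time t, assuming linear ramps between breakpoints."""
    points = GRADIENTS[system]
    times = [p[0] for p in points]
    if t <= times[0]:
        return float(points[0][1])
    if t >= times[-1]:
        return float(points[-1][1])
    i = bisect_right(times, t)          # index of the first breakpoint after t
    (t0, b0), (t1, b1) = points[i - 1], points[i]
    return b0 + (b1 - b0) * (t - t0) / (t1 - t0)


print(percent_b(1, 20))    # halfway through the first ramp of system 1
print(percent_b(2, 27.5))  # halfway through the shallow ramp of system 2
```

Written this way, the difference between the two programs is easy to see: system 2 holds the starting composition for the first 10 time units and then ramps over a longer interval than system 1.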



The following methods were used to separate the derivatized
3-hydroxypropionyl-CoA, which was formed from 3-HP, from 2-hydroxypropionyl-CoA
(i.e., lactyl-CoA), which was formed from lactate. Because these structural isomers have
identical masses and mass spectral fragmentation behavior, the separation and
identification of these analytes in a mixture depends on their chromatographic separation.
While elution system 1 provided excellent separation of the CoA thioesters tested
(Figure 46), it was unable to resolve 3-HP-CoA and lactyl-CoA. An alternative LC
elution system was developed using ammonium acetate and triethylamine (system 2;
Table 3).
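The reason mass alone cannot distinguish the two analytes can be shown with a short calculation. The sketch below builds the elemental composition of a hydroxypropionyl-CoA thioester from coenzyme A (C21H36N7O16P3S) plus a hydroxypropionic acid (C3H6O3) minus water, and computes the monoisotopic [M + H]+ value. The helper names are hypothetical, and the isotope masses are standard reference values rounded to eight decimals.

```python
# Monoisotopic masses (u) of the most abundant isotopes, and the proton mass.
MONO = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "P": 30.97376200,
    "S": 31.97207117,
}
PROTON = 1.00727647

COENZYME_A = {"C": 21, "H": 36, "N": 7, "O": 16, "P": 3, "S": 1}
WATER = {"H": 2, "O": 1}
# 2-hydroxypropionic (lactic) acid and 3-hydroxypropionic acid are isomers,
# so both have the same elemental composition.
LACTIC_ACID = {"C": 3, "H": 6, "O": 3}
THREE_HP = {"C": 3, "H": 6, "O": 3}


def thioester(acid: dict) -> dict:
    """Composition of the CoA thioester: CoA + acid - H2O."""
    out = dict(COENZYME_A)
    for element, n in acid.items():
        out[element] = out.get(element, 0) + n
    for element, n in WATER.items():
        out[element] -= n
    return out


def mz_protonated(formula: dict) -> float:
    """Monoisotopic m/z of the singly protonated molecule, [M + H]+."""
    return sum(MONO[el] * n for el, n in formula.items()) + PROTON


mz_lactyl = mz_protonated(thioester(LACTIC_ACID))
mz_3hp = mz_protonated(thioester(THREE_HP))
print(round(mz_lactyl, 2), round(mz_3hp, 2))
```

Both thioesters come out at the same value, which rounds to the nominal m/z of 840 monitored in this example, so the identification has to rest on retention time.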

The ability of system 2 to separate 3-HP-CoA and lactyl-CoA was tested on a
mixture of these two compounds. Comparing the results from a mixture of 3-HP-CoA
and lactyl-CoA (Figure 47, Panel A) to the results from lactyl-CoA only (Figure 47, Panel
B) revealed that system 2 can separate 3-HP-CoA and lactyl-CoA. The mass spectrum
recorded under peak 1 (Figure 47, Panel A insert) was used to identify peak 1 as being a
hydroxypropionyl-CoA thioester when compared to Figure 46, Panel C. In addition,
comparison of Panels A and B of Figure 47 as well as the mass spectra results
corresponding to each peak revealed that peak 1 corresponds to 3-HP-CoA and peak 2
corresponds to lactyl-CoA.

System 2 was used to confirm that E. coli transformed with pEIITHrEI produced
3-HP in that 3-HP-CoA was detected. Specifically, an ion chromatogram for m/z = 840 in
the analysis of a CoA transferase-treated fermentation broth aliquot collected from a
culture of E. coli containing pEIITHrEI revealed the presence of 3-HP-CoA (Figure 48,
Panel A). The CoA transferase-treated fermentation broth aliquot collected from a
culture of E. coli lacking pEIITHrEI did not exhibit the peak corresponding to 3-HP-CoA
(Figure 48, Panel B). Thus, these results indicate that the pEIITHrEI plasmid directs the
expression of polypeptides having propionyl-CoA transferase activity, lactyl-CoA
dehydratase activity, and acrylyl-CoA hydratase activity. These results also indicate that
expression of these polypeptides leads to the formation of 3-HP.

Example 9 - Cloning nucleic acid molecules that encode
a polypeptide having acetyl-CoA carboxylase activity

Polypeptides having acetyl-CoA carboxylase activity catalyze the first committed
step of fatty acid synthesis by carboxylation of acetyl-CoA to malonyl-CoA.
Polypeptides having acetyl-CoA carboxylase activity are also responsible for providing
malonyl-CoA for the biosynthesis of very-long-chain fatty acids required for proper cell
function. Polypeptides having acetyl-CoA carboxylase activity can be biotin-dependent
enzymes in which the cofactor biotin is post-translationally attached to a specific lysine
residue. The reaction catalyzed by such polypeptides consists of two discrete half
reactions. In the first half reaction, biotin is carboxylated by bicarbonate in an
ATP-dependent reaction to form carboxybiotin. In the second half reaction, the carboxyl
group is transferred to acetyl-CoA to form malonyl-CoA.

Prokaryotic and eukaryotic polypeptides having acetyl-CoA carboxylase activity
exist. The prokaryotic polypeptide is a multi-subunit enzyme (four subunits), where each
of the subunits is encoded by a different nucleic acid sequence. For example, in E. coli,
the following genes encode the four subunits of acetyl-CoA carboxylase:
accA: Acetyl-coenzyme A carboxylase carboxyl transferase subunit alpha
      (GenBank® accession number M96394)
accB: Biotin carboxyl carrier protein (GenBank® accession number U18997)
accC: Biotin carboxylase (GenBank® accession number U18997)
accD: Acetyl-coenzyme A carboxylase carboxyl transferase subunit beta
      (GenBank® accession number M68934)
The eukaryotic polypeptide is a high molecular weight multi-functional enzyme
encoded by a single gene. For example, in Saccharomyces cerevisiae, the acetyl-CoA
carboxylase can have the sequence set forth in GenBank® accession number M92156.
The prokaryotic type acetyl-CoA carboxylase from E. coli was overexpressed
using T7 promoter vector pFN476 as described elsewhere (Davis et al., J. Biol. Chem.,
275:28593-28598 (2000)). The eukaryotic type acetyl-CoA carboxylase gene was
amplified from Saccharomyces cerevisiae genomic DNA. Two primers were designed to
amplify the acc1 gene from S. cerevisiae (acc1F 5'-atagt^GGCCGCAG C-3', SEQ ID
NO:138 where the bold is homologous sequence, the italics is a Not I site, the underline
is a RBS, and the lowercase is extra; and acc1R
5'-atgctcgcatCTCG^CTTAGCTAAATTAAATTACATCAATAGTA-3', SEQ ID NO:139 where the bold is
homologous sequence, the italics is a Xho I site, and the lowercase is extra). The
following PCR mix is used to amplify the acc1 gene: 10X pfu buffer (10 µL), dNTP (10 mM;
2 µL), cDNA (2 µL), acc1F (100 µM; 1 µL), acc1R (100 µM; 1 µL), pfu enzyme (2.5
units/µL; 2 µL), and DI water (82 µL). The following protocol was used to amplify the
acc1 gene. After performing PCR, the PCR product was separated on a gel, and the band
corresponding to acc1 nucleic acid (about 6.7 kb) was gel isolated using a Qiagen gel
isolation kit. The PCR fragment is digested with Not I and Xho I (New England BioLabs)
restriction enzymes. The digested PCR fragment is then ligated to pET30a, which was
restricted with Not I and Xho I and dephosphorylated with SAP enzyme. The E. coli
strain DH10B was transformed with 1 µL of the ligation mix, and the cells were
recovered in 1 mL of SOC media. The transformed cells were selected on LB/kanamycin
(50 µg/mL) plates. Eight single colonies are selected, and PCR was used to screen for the
correct insert. The plasmid having the correct insert was isolated using a Qiagen Spin
Miniprep kit.

To obtain a polypeptide having acetyl-CoA carboxylase activity, the plasmid
pMSD8 or pET30a/acc1 overexpressing E. coli or S. cerevisiae acetyl-CoA carboxylase
was transformed into Tuner pLacI chemically competent cells (Novagen, Madison, WI).
The transformed cells were selected on LB/chloramphenicol (25 µg/mL) plus
carbenicillin (50 µg/mL) or kanamycin (50 µg/mL).
A crude extract of this strain can be prepared in the following manner. An
overnight culture of Tuner pLacI with pMSD8 is subcultured into 200 mL (in a one liter
baffled culture flask) of fresh M9 media supplemented with 0.4% glucose, 1 µg/mL
thiamine, 0.1% casamino acids, and 50 µg/mL carbenicillin or 50 µg/mL kanamycin and
25 µg/mL chloramphenicol. The culture is grown at 37°C in a shaker with 250 rpm
agitation until it reaches an optical density at 600 nm of about 0.6. IPTG is then added to
a final concentration of 100 µM. The culture is then incubated for an additional 3 hours
with a shaking speed of 250 rpm at 37°C. Cells are harvested by centrifugation at 8000 x g
and are washed one time with 0.85% NaCl. The cell pellet is resuspended in a minimal
volume of 50 mM Tris-HCl (pH 8.0), 5 mM MgCl2, 100 mM KCl, 2 mM DTT, and 5%
glycerol. The cells are lysed by passing them two times through a French Pressure cell at
1000 psig pressure. The cell debris is removed by centrifugation for 20 minutes at
30,000 x g.

The enzyme can be assayed using a method from Davis et al. (J. Biol. Chem.,
275:28593-28598 (2000)).

Example 10 - Cloning a nucleic acid molecule that encodes a polypeptide
having malonyl-CoA reductase activity from Chloroflexus aurantiacus
A polypeptide having malonyl-CoA reductase activity was partially purified from
Chloroflexus aurantiacus and used to obtain amino acid micro-sequencing results.
The amino acid sequencing results were used to identify and clone the nucleic acid that
encodes a polypeptide having malonyl-CoA reductase activity.

Biomass required for protein purification was grown in B. Braun BIOSTAT B
fermenters (B. Braun Biotech International GmbH, Melsungen, Germany). A glass vessel
fitted with a water jacket for heating was used to grow the required biomass. The glass
vessel was connected to its own control unit. The liquid working volume was 4 L, and
the fermenter was operated at 55°C with 75 rpm of agitation. Carbon dioxide was
occasionally bubbled through the headspace of the fermenter to maintain anaerobic
conditions. The pH of the cultures was monitored using a standard pH probe and was
maintained between 8.0 and 8.3. The inoculum for the fermenters was grown in two 250
mL bottles in an Innova 4230 Incubator Shaker (New Brunswick Scientific, Edison, NJ)
at 55°C with interior lights. The fermenters were illuminated by three 65 W plant light
reflector lamps (General Electric, Cleveland, OH). Each lamp was placed approximately
50 cm away from the glass vessel. The media used for the inoculum and the fermenter
culture was as follows per liter: 0.07 g EDTA, 1 mL micronutrient solution, 1 mL FeCl3
solution, 0.06 g CaSO4·2H2O, 0.1 g MgSO4·7H2O, 0.008 g NaCl, 0.075 g KCl, 0.103 g
KNO3, 0.68 g NaNO3, 0.111 g Na2HPO4, 0.2 g NH4Cl, 1 g yeast extract, 2.5 g casamino
acids, 0.5 g glycyl-glycine, and 900 mL DI water. The micronutrient solution contained
the following per liter: 0.5 mL H2SO4 (conc.), 2.28 g MnSO4·7H2O, 0.5 g ZnSO4·7H2O,
0.5 g H3BO3, 0.025 g CuSO4·2H2O, 0.025 g Na2MoO4·2H2O, and 0.045 g CoCl2·6H2O.
The FeCl3 solution contained 0.2905 g FeCl3 per liter. After adjusting the pH of the
media to 8.2 to 8.4, 0.75 g/L Na2S·9H2O was added, the pH was readjusted to 8.2 to 8.4,
and the media was filter-sterilized through a 0.22 µm filter.

The fermenter was inoculated with 500 mL of grown culture. The fermentation
was stopped, and the biomass was harvested after the cell density was about 0.5 to 0.6
units at 600 nm.

The cells were harvested by centrifugation at 5000 x g (Beckman JLA 8.1000
rotor) at 4°C, washed with 1 volume of ice cold 0.85% NaCl, and centrifuged again. The
cell pellet was resuspended in 30 mL of ice cold 100 mM Tris-HCl (pH 7.8) buffer that
was supplemented with 2 mM DTT, 5 mM MgCl2, 0.4 mM PEFABLOC (Roche
Molecular Biochemicals, Indianapolis, IN), 1% streptomycin sulfate, and 2 tablets of
Complete EDTA-free protease inhibitor cocktail (Roche Molecular Biochemicals,
Indianapolis, IN). The cell suspension was lysed by passing the suspension three times
through a 50 mL French Pressure Cell operated at 1600 psi (gauge reading). Cell debris
was removed by centrifugation at 30,000 x g (Beckman JA 25.50 rotor). The crude
extract was filtered prior to chromatography using a 0.2 µm HT Tuffryn membrane
syringe filter (Pall Corp., Ann Arbor, MI). The protein concentration of the crude extract
was 29 mg/mL, which was determined using the BioRad Protein Assay according to the
manufacturer's microassay protocol. Bovine gamma globulin was used for the standard
curve determination. This assay was based on the Bradford dye-binding procedure
(Bradford, Anal. Biochem., 72:248 (1976)).

Before starting the protein purification, the following assay was used to determine
the activity of malonyl-CoA reductase in the crude extract. A 50 µL aliquot of the cell
extract (29 mg/mL) was added to 10 µL of 1 M Tris-HCl (final concentration in assay 100
mM), 10 µL of 10 mM malonyl-CoA (final concentration in assay 1 mM), 5.5 µL of 5.5 mM
NADPH (final concentration in assay 0.3 mM), and 24.5 µL DI water in a 96 well UV
transparent plate (Corning, NY). The enzyme activity was measured at 45°C using a
SpectraMAX Plus 96 well plate reader (Molecular Devices, Sunnyvale, CA). The activity
of malonyl-CoA reductase was monitored by measuring the disappearance of NADPH at
340 nm wavelength. The crude extract exhibited malonyl-CoA reductase activity.
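The final concentrations quoted for this assay follow from the usual dilution relation (stock concentration x volume added / total volume). The sketch below recomputes them from the volumes listed above. The NADPH volume of 5.5 µL is an assumption made here: it is the reading under which the listed volumes add up to 100 µL and the stated 0.3 mM final concentration is reproduced. Variable names are hypothetical.

```python
# (component, volume added in µL, stock concentration in mM or None)
components = [
    ("cell extract", 50.0, None),
    ("Tris-HCl", 10.0, 1000.0),      # 1 M stock
    ("malonyl-CoA", 10.0, 10.0),
    ("NADPH", 5.5, 5.5),             # volume assumed, see lead-in
    ("DI water", 24.5, None),
]

total_ul = sum(volume for _, volume, _ in components)

# Final concentration (mM) of every component that has a defined stock.
final_mm = {
    name: stock * volume / total_ul
    for name, volume, stock in components
    if stock is not None
}

# Protein added to the well: 50 µL of a 29 mg/mL extract.
protein_mg = 29.0 * 50.0 / 1000.0

print(f"total volume: {total_ul} µL")
for name, conc in final_mm.items():
    print(f"{name}: {conc:.4g} mM")
print(f"protein per well: {protein_mg} mg")
```

The computed values (100 mM Tris-HCl, 1 mM malonyl-CoA, and about 0.30 mM NADPH in a 100 µL reaction) agree with the final concentrations stated in the assay description.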

The 5 mL (total 145 mg) protein cell extract was diluted with 20 mL buffer A (20
mM ethanolamine (pH 9.0), 5 mM MgCl2, 2 mM DTT). The chromatographic protein
purification was conducted using a BioLogic protein purification system (BioRad,
Hercules, CA). The 25 mL of diluted cell extract was loaded onto a UNO Q-6 ion-exchange
column that had been equilibrated with buffer A at a rate of 1 mL/minute. After sample
loading, the column was washed with 2.5 column volumes of buffer A at a rate of 2
mL/minute. The proteins were eluted with a linear gradient of NaCl in buffer A from 0 to
0.33 M over 25 column volumes. During the entire chromatographic separation, 3 mL
fractions were collected. The collection tubes contained 50 µL of Tris-HCl (pH 6.5) so
that the pH of the eluted sample dropped to about pH 7. Major chromatographic peaks
were detected in the region that corresponded to fractions 18 to 21 and 23 to 30. A 200
µL sample was taken from these fractions and concentrated in a microcentrifuge at 4°C
using Microcon YM-10 columns (Millipore Corp., Bedford, MA) as per the manufacturer's
instructions. To each of the concentrated fractions, buffer A-Tris (100 mM Tris-HCl (pH
7.8), 5 mM MgCl2, 2 mM DTT) was added to bring the total volume to 100 µL. Each of
these fractions was tested for malonyl-CoA reductase activity using the
spectrophotometric assay described above. The majority of the specific malonyl-CoA
reductase activity was found in fractions 18 to 21. These fractions were pooled together,
and the pooled sample was desalted using a PD-10 column (Amersham Pharmacia,
Piscataway, NJ) as per the manufacturer's instructions.

The 10.5 mL of desalted protein extract was diluted with buffer A-Tris to a
volume of 25 mL. This desalted diluted sample was applied to a 1 mL HiTrap Blue
column (Amersham Pharmacia, Piscataway, NJ) which was equilibrated with buffer
A-Tris. The sample was loaded at a rate of 0.1 mL/minute. Unbound proteins were washed
with 2.5 CV buffer A-Tris. The protein was eluted with 100 mM Tris (pH 7.8), 5 mM
MgCl2, 2 mM DTT, 2 mM NADPH, and 1 M NaCl. During this separation process, 1
mL fractions were collected. A 200 µL sample was drawn from fractions 49 to 54 and
concentrated. Buffer A-Tris was added to each of the concentrated fractions to bring the
total volume to 100 µL. Fractions were assayed for enzyme activity as described above.
The highest specific activity was observed in fraction 51. The entire fraction 51 was
concentrated as described above, and the concentrated sample was separated on an
SDS-PAGE gel.

Electrophoresis was carried out using a Bio-Rad Protean II minigel system and
pre-cast SDS-PAGE gels (4-15%), or a Protean II XI system and 16 cm x 20 cm x 1 mm
SDS-PAGE gels (10%) cast as per the manufacturer's protocol. The gels were run
according to the manufacturer's instructions with a running buffer of 25 mM Tris-HCl
(pH 8.3), 192 mM glycine, and 0.1% SDS.

A gel thickness of 1 mm was used to run samples from fraction 51. Protein from
fraction 51 was loaded onto a 10% SDS-PAGE gel (3 lanes, each containing 75 µg of total
protein). The gels were stained briefly with Coomassie blue (Bio-Rad, Hercules, CA) and
then destained to a clear background with a 10% acetic acid and 20% methanol solution.
The staining revealed a band of about 130 to 140 KDa.

The protein band of about 130-140 KDa was excised with no excess unstained gel
present. An equal area of gel without protein was excised as a negative control. The gel
slices were placed in uncolored microcentrifuge tubes, prewashed with 50% acetonitrile
in HPLC-grade water, washed twice with 50% acetonitrile, and shipped on dry ice to
Harvard Microchemistry Sequencing Facility, Cambridge, MA.

After in situ enzymatic digestion of the polypeptide sample with trypsin, the
resulting polypeptides were separated by micro-capillary reverse-phase HPLC. The
HPLC was directly coupled to the nano-electrospray ionization source of a Finnigan LCQ
quadrupole ion trap mass spectrometer (µLC/MS/MS). Individual sequence spectra
(MS/MS) were acquired on-line at high sensitivity for the multiple polypeptides separated
during the chromatographic run. The MS/MS spectra of the polypeptides were correlated
with known sequences using the algorithm Sequest developed at the University of
Washington (Eng et al., J. Am. Soc. Mass Spectrom., 5:976 (1994)) and programs
developed at Harvard (Chittum et al., Biochemistry, 37:10866 (1998)). The results were
reviewed for consensus with known proteins and for manual confirmation of fidelity.

A similar purification procedure was used to obtain another sample (protein 1 
sample) that was subjected to the same analysis that was used to evaluate the fraction 51 
sample. 

The polypeptide sequence results indicated that the polypeptides obtained from
both the fraction 51 sample and the protein 1 sample had similarity to the six (764, 799,
859, 923, 1090, 1024) contigs sequenced from the C. aurantiacus genome and presented
on the Joint Genome Institute's web site (http://www.jgi.doe.gov/). The 764 contig was
the most prominent of the six with about 40 peptide sequences showing similarity.
BLASTX analysis of each of these contigs on the GenBank web site
(http://www.ncbi.nlm.n…) indicated that the 764
contig (4201 bases) encoded polypeptides that had a dehydrogenase/reductase type
activity. Close inspection of the 764 contig, however, revealed that this contig did not
have an appropriate ORF that would encode a 130-140 KDa polypeptide.

BLASTX analysis also was conducted using the other five contigs. The results of
this analysis were as follows. The 799 contig (3173 bases) appeared to encode
polypeptides having phosphate and dehydrogenase type activities. The 859 contig (5865
bases) appeared to encode polypeptides having synthetase type activities. The 923 contig
(5660 bases) appeared to encode polypeptides having elongation factor and synthetase
type activities. The 1090 contig (15201 bases) appeared to encode polypeptides having
dehydrogenase/reductase and cytochrome and sigma factor activities. The 1024 contig
(12276 bases) appeared to encode polypeptides having dehydrogenase and decarboxylase
and synthetase type activities. Thus, the 859 and 923 contigs were eliminated from any
further analysis.

The results from the BLASTX analysis also revealed that the dehydrogenase
found in the 1024 contig was most likely an inositol monophosphate dehydrogenase.
Thus, the 1024 contig was eliminated as a possible candidate that might encode a
polypeptide having malonyl-CoA reductase activity. The 799 contig also was eliminated
since this contig is part of the OS17 polypeptide described above.

This narrowed down the search to two contigs, the 764 and 1090 contigs. Since the
contigs were identified using the same protein sample and the dehydrogenase activities
found in these contigs gave very similar BLASTX results, it was hypothesized that they
are part of the same polypeptide. Additional evidence supporting this hypothesis was
obtained from the discovery that the 764 and 1090 contigs are adjacent to each other in
the C. aurantiacus genome as revealed by an analysis of scaffold data provided by the
Joint Genome Institute. Sequence similarity and assembly analysis, however, revealed no
overlapping sequence between these two contigs, possibly due to the presence of gaps in
the genome sequencing.

The polypeptide sequences that belonged to the 764 and 1090 contigs were
mapped. Based on this analysis, an appropriate coding frame and potential start and stop
codons were identified. The following PCR primers were designed to PCR amplify a
fragment that encoded a polypeptide having malonyl-CoA reductase activity:
PRO140F 5'-ATGGCGAGGGGCGAGTCCATGAG-3', SEQ ID NO:153; PRO140R
5'-GGACACGAAGAACAGGGCGACAC-3', SEQ ID NO:154; and PRO140UPF
5'-GAACTGTCTGGAGTAAGGCTGTC-3', SEQ ID NO:155. The PRO140F primer was
designed based on the sequence of the 1090 contig and corresponds to the start of the
potential start codon. The twelfth base was changed from G to C to avoid primer-dimer
formation. This change does not change the amino acid that was encoded by the codon.
The PRO140R primer was designed based on the sequence of the 764 contig and corresponds
to a region located about 1 Kb downstream from the potential stop codon. The
PRO140UPF primer was designed based on the sequence of the 1090 contig and corresponds
to a region located about 300 bases upstream of the potential start codon.
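
The statement that the G-to-C change at the twelfth base of PRO140F is silent can be checked with a short sketch. The code below assumes the primer sequence as read from SEQ ID NO:153 and uses the standard genetic code; the helper names (`translate`, `original_primer`) are illustrative only and are not part of the described method.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# PRO140F as designed (twelfth base is C), read from SEQ ID NO:153.
PRO140F = "ATGGCGAGGGGCGAGTCCATGAG"

def translate(seq: str) -> str:
    """Translate complete codons from position 1; trailing bases are ignored."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))

# Restore the genomic base (G) at position 12 to model the unmodified sequence.
original_primer = PRO140F[:11] + "G" + PRO140F[12:]

print(PRO140F[9:12], "->", CODON_TABLE[PRO140F[9:12]])
print(original_primer[9:12], "->", CODON_TABLE[original_primer[9:12]])
print(translate(PRO140F) == translate(original_primer))
```

The twelfth base is the third position of the fourth codon, so both GGG and GGC encode glycine and the translated peptide is unchanged.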

Genomic C. aurantiacus DNA was obtained. Briefly, C. aurantiacus was grown
in 50 mL cultures for 3 to 4 days. Cells were pelleted and washed with 5 mL of a 10 mM
Tris solution. The genomic DNA was then isolated using the gram positive bacteria
protocol provided with the Gentra Genomic "Puregene" DNA isolation kit (Gentra Systems,
Minneapolis, MN). The cell pellet was resuspended in 1 mL Gentra Cell Suspension
Solution to which 14.2 mg of lysozyme and 4 µL of 20 mg/mL proteinase K solution were
added. The cell suspension was incubated at 37°C for 30 minutes. The precipitated
genomic DNA was recovered by centrifugation at 3500 g for 25 minutes and air-dried for
10 minutes. The genomic DNA was suspended in an appropriate amount of a 10 mM
Tris solution and stored at 4°C.

Two PCR reactions were set up using C. aurantiacus genomic DNA as template
as follows:

PCR Reaction #1                                 PCR program

3.3X rTH polymerase buffer      30 µL           94°C    2 minutes
Mg(OAC) (25 mM)                 4 µL            29 cycles of:
dNTP Mix (10 mM)                3 µL              94°C    30 seconds
PRO140F (100 µM)                2 µL              63°C    45 seconds
PRO140R (100 µM)                2 µL              68°C    4.5 minutes
Genomic DNA (100 ng/mL)         1 µL            68°C    7 minutes
rTH polymerase (2 U/µL)         2 µL            4°C     Until further use
Pfu polymerase (2.5 U/µL)       0.25 µL
DI water                        55.75 µL
Total                           100 µL


PCR Reaction #2                                 PCR program

3.3X rTH polymerase buffer      30 µL           94°C    2 minutes
Mg(OAC) (25 mM)                 4 µL            29 cycles of:
dNTP Mix (10 mM)                3 µL              94°C    30 seconds
PRO140UPF (100 µM)              2 µL              60°C    45 seconds
PRO140R (100 µM)                2 µL              68°C    4.5 minutes
Genomic DNA (100 ng/mL)         1 µL            68°C    7 minutes
rTH polymerase (2 U/µL)         2 µL            4°C     Until further use
Pfu polymerase (2.5 U/µL)       0.25 µL
DI water                        55.75 µL
Total                           100 µL
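
The final concentrations implied by the reaction mixes above follow from the dilution relation C_final = C_stock x V_added / V_total. The sketch below works through PCR Reaction #1 under two stated assumptions: the primer stocks are read as 100 µM, and the genomic DNA volume is read as 1 µL, which is the value that makes the listed volumes sum to 100 µL. The dictionary keys and the function name are illustrative.

```python
# Stock concentration and pipetted volume (µL) for each component with a
# stated stock concentration in PCR Reaction #1.
TOTAL_UL = 100.0
components = {
    "rTH buffer (X)": (3.3, 30.0),
    "Mg(OAC) (mM)": (25.0, 4.0),
    "dNTP mix (mM)": (10.0, 3.0),
    "PRO140F (uM)": (100.0, 2.0),   # assumption: stock read as 100 µM
    "PRO140R (uM)": (100.0, 2.0),   # assumption: stock read as 100 µM
}
# Genomic DNA (assumed 1 µL), rTH polymerase, Pfu polymerase, DI water.
other_volumes_ul = [1.0, 2.0, 0.25, 55.75]

def final_concentration(stock: float, volume_ul: float, total_ul: float = TOTAL_UL) -> float:
    """Dilution relation: C_final = C_stock * V_added / V_total."""
    return stock * volume_ul / total_ul

final = {name: final_concentration(s, v) for name, (s, v) in components.items()}
total_pipetted = sum(v for _, v in components.values()) + sum(other_volumes_ul)

for name, value in final.items():
    print(f"{name}: {value:.2f}")
print(f"total volume: {total_pipetted} uL")
```

Under these assumptions each primer is present at about 2 µM, magnesium at about 1 mM, and each reaction is at roughly 1X buffer.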



The products from both PCR reactions were separated on a 0.8% TAE gel. Both
PCR reactions produced a product of 4.7 to 5 Kb in size. This approximately matched the
expected size of a nucleic acid molecule that could encode a polypeptide having
malonyl-CoA reductase activity.

Both PCR products were sequenced using sequencing primers (1090Fseq
5'-<}ATTCCGTATGTCACCCCTA-3', SEQ ID NO:156; and 764Rseq
5'-CAGGCGACTGGCAATCACAA-3', SEQ ID NO:157). The sequence analysis revealed
a gap between the 764 and 1090 contigs. The nucleic acid sequence between the
sequences from the 764 and 1090 contigs was greater than 300 base pairs in length (Figure
51). In addition, the sequence analysis revealed an ORF of 3678 bases that showed
similarities to dehydrogenase/reductase type enzymes (Figure 52). The amino acid
sequence encoded by this ORF is 1225 amino acids in length (Figure 50). Also, BLASTP
analysis of the amino acid sequence revealed short chain
dehydrogenase domains (adh type). These results are consistent with a polypeptide
having malonyl-CoA reductase activity since such an enzyme involves two reduction
steps for the conversion of malonyl-CoA to 3-HP. Further, the computed MW of the
polypeptide was determined to be about 134 KDa.
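
The relationship between the ORF length, the polypeptide length, and the approximate molecular weight can be sketched as follows. The average residue mass of 110 Da is a common rule of thumb and an assumption here; it is not the method used to compute the reported value, which would depend on the actual amino acid composition.

```python
ORF_BASES = 3678              # ORF length reported from the sequence analysis
AVG_RESIDUE_MASS_DA = 110.0   # assumption: rule-of-thumb average residue mass

# An ORF that includes its stop codon encodes (bases / 3) - 1 residues.
codons_in_orf = ORF_BASES // 3
n_residues = codons_in_orf - 1

approx_mass_kda = n_residues * AVG_RESIDUE_MASS_DA / 1000.0

print(f"codons (including stop): {codons_in_orf}")
print(f"encoded residues: {n_residues}")
print(f"approximate mass: {approx_mass_kda:.1f} KDa")
```

The estimate falls within the 130 to 140 KDa range of the band excised from the gel, consistent with the computed value reported above.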

PCR was conducted using the PRO140F/PRO140R primer pair, C. aurantiacus
genomic DNA, and the protocol described above as PCR reaction #1. After the PCR was
completed, 0.25 U of Taq polymerase (Roche Molecular Biochemicals, Indianapolis, IN)
was added to the PCR mix, which was then incubated at 72°C for 10 minutes. The PCR
product was column purified using a Qiagen PCR purification kit (Qiagen Inc., Valencia,
CA). The purified PCR product was then TOPO cloned into expression vector
pCRT7/CT as per the manufacturer's instructions (Invitrogen, Carlsbad, CA). TOP10 F'
chemically competent cells were transformed with the TOPO ligation mix as per the
manufacturer's instructions (Invitrogen, Carlsbad, CA). The cells were recovered for half
an hour, and the transformants were selected on LB/ampicillin (100 µg/mL) plates.
Twenty single colonies were selected, and the plasmid DNA was isolated using a Qiagen
spin Mini prep kit (Qiagen Inc., Valencia, CA).

Each of these twenty clones was tested for correct orientation and right insert size
by PCR. Briefly, plasmid DNA was used as a template, and the following two primers
were used in the PCR amplification: PCRT7 5'-GAGACCACAACGGTTTCCCTCTA-3',
SEQ ID NO:158; and PRO140R 5'-GGACACGAAGAACAGGGCGACAC-3', SEQ
ID NO:159. The following PCR reaction mix and program were used:

PCR Reaction                                    PCR program

3.3X rTH polymerase buffer      7.5 µL          94°C    2 minutes
Mg(OAC) (25 mM)                 1 µL            25 cycles of:
dNTP Mix (10 mM)                0.5 µL            94°C    30 seconds
PCRT7 (100 µM)                  0.125 µL          55°C    45 seconds
PRO140R (100 µM)                0.125 µL          68°C    4 minutes
Plasmid DNA                     0.5 µL          68°C    7 minutes
rTH polymerase (2 U/µL)         0.5 µL          4°C     Until further use
DI water                        14.75 µL
Total                           25 µL





Out of twenty clones tested, only one clone exhibited the correct insert (Clone # P-10).
Chemically competent cells of BL21(DE3)pLysS (Invitrogen, Carlsbad, CA) were
transformed with 2 µL of the P-10 plasmid DNA as per the manufacturer's instructions.
The cells were recovered at 37°C for 30 minutes and were plated on LB ampicillin (100
µg/mL) and chloramphenicol (25 µg/mL).

A 20 mL culture of BL21(DE3)pLysS/P-10 and a 20 mL control culture of
BL21(DE3)pLysS were incubated overnight. Using the overnight cultures as an inoculum,
two 100 mL BL21(DE3)pLysS/P-10 clone cultures and two control strain cultures
(BL21(DE3)pLysS) were started. All the cultures were induced with IPTG when they
reached an OD of about 0.5 at 600 nm. The control strain cultures were induced with 10
µM IPTG or 100 µM IPTG, while one of the BL21(DE3)pLysS/P-10 clone cultures was
induced with 10 µM IPTG and the other with 100 µM IPTG. The cultures were grown for
2.5 hours after induction. Aliquots were taken from each of the culture flasks before and
after 2.5 hours of induction and separated using 4-15% SDS-PAGE to analyze
polypeptide expression. In the induced BL21(DE3)pLysS/P-10 samples, a band
corresponding to a polypeptide having a molecular weight of about 135 KDa was
observed. This band was absent in the control strain samples and in samples taken before
IPTG induction.

To assess malonyl-CoA reductase activity, BL21(DE3)pLysS/P-10 and
BL21(DE3)pLysS cells were cultured and then harvested by centrifugation at 8,000 x g
(Rotor JA 16.250, Beckman Coulter, Fullerton, CA). Once harvested, the cells were
washed once with an equal volume of a 0.85% NaCl solution. The cell pellets were
resuspended in 100 mM Tris-HCl buffer that was supplemented with 5 mM MgCl2 and
2 mM DTT. The cells were disrupted by passing twice through a French Press Cell at
1,000 psi pressure (gauge value). The cell debris was removed by centrifugation at
30,000 x g (Rotor JA 25.50, Beckman Coulter, Fullerton, CA). The cell extract was
maintained at 4°C or on ice until further use.

Activity of malonyl-CoA reductase was measured at 37°C for both the control
cells and the IPTG-induced cells. The activity of malonyl-CoA reductase was monitored
by observing the disappearance of added NADPH as described above. No activity was
found in the cell extract of the control strain, while the cell extract from the IPTG-induced
BL21(DE3)pLysS/P-10 cells displayed malonyl-CoA reductase activity with a specific
activity calculated to be about 0.0942 µmole/minute/mg of total protein.
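
A specific activity of this kind is typically derived from the rate of absorbance decrease through the Beer-Lambert law. The following sketch shows the arithmetic only. It assumes monitoring at 340 nm with an NADPH extinction coefficient of 6.22 mM^-1 cm^-1 and a 1 cm path length; the wavelength and conditions of the assay referred to above are not restated in this section, and the input values below are hypothetical, not data from this example.

```python
# Assumption: NADPH monitored at 340 nm; extinction coefficient in mM^-1 cm^-1.
NADPH_EXTINCTION_MM = 6.22

def specific_activity(delta_a_per_min: float,
                      reaction_volume_ml: float,
                      protein_mg: float,
                      path_cm: float = 1.0,
                      extinction_mm: float = NADPH_EXTINCTION_MM) -> float:
    """Return µmol NADPH consumed per minute per mg of total protein.

    delta_a_per_min is the magnitude of the absorbance decrease per minute.
    """
    # Beer-Lambert: concentration change in mM per minute (= µmol/mL/min).
    rate_mm_per_min = delta_a_per_min / (extinction_mm * path_cm)
    # Scale by the reaction volume to get µmol/min, then normalize to protein.
    rate_umol_per_min = rate_mm_per_min * reaction_volume_ml
    return rate_umol_per_min / protein_mg

# Hypothetical inputs for illustration only.
example = specific_activity(delta_a_per_min=0.0622,
                            reaction_volume_ml=1.0,
                            protein_mg=0.1)
print(f"{example:.3f} umol/min/mg")
```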

Malonyl-CoA reductase activity also was measured by analyzing 3-HP formation
from malonyl-CoA using the following reaction conducted at 37°C:

                          Volume      Final conc.
Tris-HCl (1 M)            10 µL       100 mM
Malonyl-CoA (10 mM)       40 µL       4 mM
NADPH (10 mM)             30 µL       3 mM
Cell extract              20 µL
Total                     100 µL

The reaction was carried out at 37°C for 30 minutes. In the control reaction, a cell
extract from BL21(DE3)pLysS was added to a final concentration of 322 mg total
protein. In the experimental reaction mix, a cell extract from BL21(DE3)pLysS/P-10 was
added to a final concentration of 226 mg of total protein. The reaction mixtures were
frozen at -20°C until further analysis.

Chromatographic separation of the components in the reaction mixtures was
performed using an HPX-87H (7.8 x 300 mm) organic acid HPLC column (BioRad
Laboratories, Hercules, CA). The column was maintained at 60°C. The mobile phase
was HPLC grade water adjusted to pH 2.5 using trifluoroacetic acid (TFA) and was
delivered at a flow rate of 0.6 mL/minute.

Detection of 3-HP in the reaction samples was accomplished using a
Waters/Micromass ZQ LC/MS instrument consisting of a Waters 2690 liquid
chromatograph (Waters Corp., Milford, MA) with a Waters 996 Photo-diode Array
(PDA) absorbance monitor placed in series between the chromatograph and the single
quadrupole mass spectrometer. The ionization source was an Atmospheric Pressure
Chemical Ionization (APCI) ionization source. All parameters of the APCI-MS system
were optimized and selected based on the generation of the protonated molecular ion
([M+H]+) of 3-HP. The following parameters were used to detect 3-HP in the positive
ion mode: Corona: 10 µA; Cone: 20 V; Extractor: 2 V; RF lens: 0.2 V; Source temperature:
100°C; APCI Probe temperature: 300°C; Desolvation gas: 500 L/hour; Cone gas:
50 L/hour; Low mass resolution: 15; High mass resolution: 15; Ion energy: 1.0;
Multiplier: 650. Data were collected in Selected Ion Reporting (SIR) mode set at m/z =
90.9.
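
As a quick plausibility check on the monitored ion, the expected m/z of protonated 3-HP can be computed from its molecular formula, C3H6O3. The sketch below uses standard monoisotopic masses; the variable names are illustrative, and the comparison with the SIR setting assumes the roughly unit mass resolution typical of a single quadrupole instrument.

```python
# Standard monoisotopic masses (Da).
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}
PROTON_MASS = 1.00727646

# 3-hydroxypropionic acid, C3H6O3.
FORMULA_3HP = {"C": 3, "H": 6, "O": 3}

neutral_mass = sum(MONOISOTOPIC[el] * n for el, n in FORMULA_3HP.items())
mz_protonated = neutral_mass + PROTON_MASS  # singly charged [M+H]+

SIR_SETTING = 90.9  # channel used in the method above

print(f"neutral monoisotopic mass: {neutral_mass:.4f}")
print(f"[M+H]+ m/z: {mz_protonated:.4f}")
print(f"offset from SIR setting: {mz_protonated - SIR_SETTING:.2f}")
```

The computed value of about 91.04 lies within a fraction of a mass unit of the 90.9 channel, which is consistent with detection of the protonated molecular ion at unit resolution.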

Both the control reaction sample and the experimental reaction sample were 
probed for presence of 3-HP using the HPLC-mass spectroscopy technique. In the 
control samples, no 3-HP peak was observed, while the experimental sample exhibited a 
peak that matched the retention and the mass of 3-HP. 

Example 11 - Constructing recombinant cells that produce 3-HP

A pathway to make 3-hydroxypropionate directly from glucose via acetyl CoA is
presented in Figure 44. Most organisms such as E. coli, Bacillus, and yeast produce
acetyl CoA from glucose via glycolysis and the action of pyruvate dehydrogenase. In
order to divert the acetyl CoA generated from glucose, it is desirable to overexpress two
genes, one encoding acetyl CoA carboxylase and the other encoding malonyl-CoA
reductase. As an example, these genes are expressed in E. coli through a T7 promoter
using vectors pET30a and pFN476. The vector pET30a has a pBR ori and kanamycin
resistance, while pFN476 has a pSC101 ori and uses carbenicillin resistance for selection.
Because these two vectors have compatible ori and different markers, they can be
maintained in E. coli at the same time. Hence, the constructs used to engineer E. coli for
direct production of 3-hydroxypropionate from glucose are pMSD8 (pFN476/accABCD)
(Davis et al., J. Biol. Chem., 275:28593-28598, 2000) and pET30a/malonyl-CoA
reductase or pET30a/accl and pFN476/malonyl-CoA reductase. The constructs are
depicted in Figure 45.

To test the production of 3-hydroxypropionate from glucose, E. coli strain Tuner
pLacI carrying plasmid pMSD8 (pFN476/accABCD) and pET30a/malonyl-CoA
reductase or pET30a/accl and pFN476/malonyl-CoA reductase is grown in a B. Braun
BIOSTAT B fermenter. A glass vessel fitted with a water jacket for heating is used to
conduct this experiment. The fermenter working volume is 1.5 L and is operated at 37°C.
The fermenter is continuously supplied with oxygen by bubbling sterile air through it at a
rate of 1 vvm. The agitation is cascaded to the dissolved oxygen concentration, which is
maintained at 40% DO. The pH of the liquid media is maintained at 7 using 2 N NaOH.
The E. coli strain is grown in M9 media supplemented with 1% glucose, 1 µg/mL
thiamine, 0.1% casamino acids, 10 µg/mL biotin, 50 µg/mL carbenicillin, 50 µg/mL
kanamycin, and 25 µg/mL chloramphenicol. The expression of the genes is induced
when the cell density reaches 0.5 OD (600 nm) by adding 100 µM IPTG. After induction,
samples of 2 mL volume are taken at 1, 2, 3, 4, and 8 hours. In addition, at 3 hours after
induction, a 200 mL sample is taken to make a cell extract. The 2 mL samples are spun,
and the supernatant is used to analyze products using an LC/MS technique. The supernatant
is stored at -20°C until further analysis.

The extract is prepared by spinning the 200 mL of cell suspension at 8000 g and
washing the cell pellet with 50 mL of 50 mM Tris-HCl (pH 8.0), 5 mM MgCl2, 100
mM KCl, 2 mM DTT, and 5% glycerol. The cell suspension is spun again at 8000 g, and
the pellet is resuspended in 5 mL of 50 mM Tris-HCl (pH 8.0), 5 mM MgCl2, 100 mM
KCl, 2 mM DTT, and 5% glycerol. The cells are disrupted by passing twice through a
French Press at 1000 psig. The cell debris is removed by centrifugation for 20 minutes at
30,000 x g. All the operations are conducted at 4°C. To demonstrate in vitro formation of
3-hydroxypropionate using this recombinant cell extract, the following reaction of 200 µL
is conducted at 37°C. The reaction mix is as follows: Tris-HCl (pH 8.0; 100 mM), ATP
(1 mM), MgCl2 (5 mM), KCl (100 mM), DTT (5 mM), NaHCO3 (40 mM), NADPH (0.5
mM), acetyl CoA (0.5 mM), and cell extract (0.2 mg). The reaction is stopped after 15
minutes by adding 1 volume of 10% trifluoroacetic acid (TFA). The products of this
reaction are detected using an LC/MS technique.

The detection and analysis for the presence of 3-hydroxypropionate in the
supernatant and the in vitro reaction mixture is carried out using a Waters/Micromass ZQ
LC/MS instrument. This instrument consists of a Waters 2690 liquid chromatograph with
a Waters 2410 refractive index detector placed in series between the chromatograph and
the single quadrupole mass spectrometer. LC separations are made using a Bio-Rad
Aminex 87H ion-exchange column at 45°C. Sugars, alcohols, and organic acid products
are eluted with 0.015% TFA buffer. For elution, the buffer is passed at a flow rate of 0.6
mL/minute. For detection and quantification of 3-hydroxypropionate, a sample obtained
from TCI America (Portland, OR) is used as a standard.

Example 12 - Cloning of propionyl-CoA transferase, lactyl-CoA dehydratase (LDH),
and a hydratase (OS19) for Expression in Saccharomyces cerevisiae

The pESC Yeast Epitope Tagging Vector System (Stratagene, La Jolla, CA) was
used in cloning the genes involved in 3-hydroxypropionic acid production via lactic acid.
The pESC vectors each contain GAL1 and GAL10 promoters in opposing directions,
allowing the expression of two genes from each vector. The GAL1 and GAL10
promoters are repressed by glucose and induced by galactose. Each of the four available
pESC vectors has a different yeast-selectable marker (HIS3, TRP1, LEU2, URA3) so
multiple plasmids can be maintained in a single strain. Each cloning region has a
polylinker site for gene insertion, a transcription terminator, and an epitope coding
sequence for C-terminal or N-terminal epitope tagging of expressed proteins. The pESC
vectors also have a ColE1 origin of replication and an ampicillin resistance gene to allow
replication and selection in E. coli. The following vector/promoter/nucleic acid
combinations were constructed:



Vector        Promoter    Polypeptide        Source of nucleic acid

pESC-Trp      GAL1        OS19 hydratase     Chloroflexus aurantiacus
              GAL10       E1                 Megasphaera elsdenii
pESC-Leu      GAL1        E2α                Megasphaera elsdenii
              GAL10       E2β                Megasphaera elsdenii
pESC-His      GAL1        D-LDH              Escherichia coli
              GAL10       PCT                Megasphaera elsdenii



The primers used were as follows:

OS19APAF: 5'-ATAGGGCCCAGGAGATCAAACCATGGGTGAAGAGTCTCTGGTTC-3' (SEQ ID NO:164)
OS19SALR: 5'-CCTCTGCTACAGTCGACACAACGACCACTGAAGTTGGGAG-3' (SEQ ID NO:165)
OS19KPNR: 5'-AGTCTGCTATCGGTACCTCAACGACCACTGAAGTTGGGAG-3' (SEQ ID NO:166)
EINOTF: 5'-ATAGCGGCCGCATAATGGATACTCTCGGAATCGACGTTGG-3' (SEQ ID NO:167)
EICLAR: 5'-CCCATCGATACATATTTCTTGATTTTATCATAAGCAATC-3' (SEQ ID NO:168)
EIIαAPAF: 5'<JCAGGGCCCATAATGGGTGAAGAAAAAACAGTAtjATATTG-3' (SEQ ID NO:169)
EIIαSALR: 5'43GTAGACTTGTCGACGTAGTGGTTTCCTCCTTCATTGG-3' (SEQ ID NO:170)
EIIβNOTF: 5'-ATAGCGGCCGCATAATGGGTCAGATCXjAGGAACTTATCAG-3' (SEQ ID NO:171)
EIIβSPER: 5'-AGGTTCAACTAGTTGGTAGAGGATTTCCGAGAAACk:CTG-3' (SEQ ID NO:172)
LDHAPAF: 5'-CTAGGGCGCATAATGGAACTCGCCGTTTATAGCAC-3' (SEQ ID NO:173)
LDHXHOR: 5'-ACTTCTGGAGTTAAACCAGTTCGTTCGGGCAGGT-3' (SEQ ID NO:174)
PCTSPEF: 5'-GGGACTAGTATAATGGGAAAAGTAGAAATCATTACAG-3' (SEQ ID NO:175)
PCTPACR: 5'-CGGCTTAATTAACAGCAGAGA…GTCC-3' (SEQ ID NO:176)

All restriction enzymes were obtained from New England Biolabs, Beverly, MA.
All plasmid DNA preparations were done using QIAprep Spin Miniprep Kits, and all gel
purifications were done using QIAquick Gel Extraction Kits (Qiagen, Valencia, CA).

A. Construction of the pESC-Trp/OS19 hydratase vector

Two constructs in pESC-Trp were made for the OS19 nucleic acid from C.
aurantiacus. One of these constructs utilized the Apa I and Sal I restriction sites of the
GAL1 multiple cloning site and was designed to include the c-myc epitope. The second
construct utilized the Apa I and Kpn I sites and thus did not include the c-myc epitope
sequence.

Six µg of pESC-Trp vector DNA was digested with the restriction enzyme Apa I,
and the digest was purified using a QIAquick PCR Purification Column. Three µg of the
Apa I-digested vector DNA was then digested with the restriction enzyme Kpn I, and 3 µg
was digested with Sal I. The double-digested vector DNAs were separated on a 1%
TAE-agarose gel, purified, dephosphorylated with shrimp alkaline phosphatase (Roche
Biochemical Products, Indianapolis, IN), and purified with a QIAquick PCR Purification
Column.

The nucleic acid encoding the Chloroflexus aurantiacus polypeptide having
hydratase activity (OS19) was amplified from genomic DNA using the PCR primer pair
OS19APAF and OS19SALR and the primer pair OS19APAF and OS19KPNR.
OS19APAF was designed to introduce an Apa I restriction site and a translation initiation
site (ACCATGG) at the beginning of the amplified fragment. The OS19KPNR primer
was designed to introduce a Kpn I restriction site at the end of the amplified fragment and
to contain the translational stop codon for the hydratase gene. OS19SALR introduces a
Sal I site at the end of the amplified fragment and has an altered stop codon so that
translation continues in-frame through the vector c-myc epitope. The PCR mix contained
the following: 1X Expand PCR buffer, 100 ng C. aurantiacus genomic DNA, 0.2 µM of
each primer, 0.2 mM each dNTP, and 5.25 units of Expand DNA Polymerase (Roche) in
a final volume of 100 µL. The PCR reaction was performed in an MJ Research PTC100
under the following conditions: an initial denaturation at 94°C for 1 minute; 8 cycles of
94°C for 30 seconds, 57°C for 1 minute, and 72°C for 2.25 minutes; 24 cycles of 94°C for
30 seconds, 62°C for 1 minute, and 72°C for 2.25 minutes; and a final extension for 7
minutes at 72°C. The amplification product was then separated by gel electrophoresis
using a 1% TAE-agarose gel. A 0.8 Kb fragment was excised from the gel and purified
for each primer pair. The purified fragments were digested with Kpn I or Sal I restriction
enzyme, purified with a QIAquick PCR Purification Column, digested with Apa I
restriction enzyme, purified again with a QIAquick PCR Purification Column, and
quantified on a minigel.
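
The primer design features described above can be checked directly against the primer sequences. The sketch below assumes the OS19 primer sequences as read from SEQ ID NOS:164-166 (with line-break hyphens removed) and uses the standard recognition sequences for Apa I (GGGCCC), Sal I (GTCGAC), and Kpn I (GGTACC). The helper names are illustrative only.

```python
PRIMERS = {
    "OS19APAF": "ATAGGGCCCAGGAGATCAAACCATGGGTGAAGAGTCTCTGGTTC",
    "OS19SALR": "CCTCTGCTACAGTCGACACAACGACCACTGAAGTTGGGAG",
    "OS19KPNR": "AGTCTGCTATCGGTACCTCAACGACCACTGAAGTTGGGAG",
}
SITES = {"Apa I": "GGGCCC", "Sal I": "GTCGAC", "Kpn I": "GGTACC"}
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]

def codon_after_site(primer: str, site: str) -> str:
    """For a reverse primer, return the sense-strand codon encoded by the
    three bases that immediately follow the restriction site."""
    start = primer.find(site)
    if start < 0:
        raise ValueError("site not found in primer")
    triplet = primer[start + len(site): start + len(site) + 3]
    return reverse_complement(triplet)

# Forward primer: Apa I site followed by the ACCATGG initiation context.
fwd = PRIMERS["OS19APAF"]
print("Apa I site present:", SITES["Apa I"] in fwd)
print("initiation context present:", "ACCATGG" in fwd)

# Reverse primers: Kpn I version keeps a stop codon, Sal I version does not.
kpn_codon = codon_after_site(PRIMERS["OS19KPNR"], SITES["Kpn I"])
sal_codon = codon_after_site(PRIMERS["OS19SALR"], SITES["Sal I"])
print("OS19KPNR codon:", kpn_codon, "stop:", kpn_codon in STOP_CODONS)
print("OS19SALR codon:", sal_codon, "stop:", sal_codon in STOP_CODONS)
```

Read this way, the Kpn I primer retains a TGA stop codon next to the site, while the Sal I primer replaces it with a sense codon, which is what allows read-through into the c-myc epitope.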

… ng of the digested PCR product containing the nucleic acid encoding the C.
aurantiacus polypeptide having hydratase activity (OS19) and 50 ng of the prepared
pESC-Trp vector were ligated using T4 DNA ligase at 16°C for 16 hours. One µL of the
ligation reaction was used to electroporate 40 µL of E. coli Electromax™ DH10B™ cells.
The electroporated cells were plated onto LB plates containing 100 µg/mL of
carbenicillin (LBC). Individual colonies were screened using colony PCR with the
appropriate PCR primers. Individual colonies were suspended in about 25 µL of 10 mM
Tris, and 2 µL of the suspension was plated on LBC media. The remnant suspension was
heated for 10 minutes at 95°C to break open the bacterial cells, and 2 µL of the heated
cells was used in a 25 µL PCR reaction. The PCR mix contained the following: 1X Taq
buffer, 0.2 µM each primer, 0.2 mM each dNTP, and 1 unit of Taq DNA polymerase per
reaction. The PCR program used was the same as described above for amplification of
the nucleic acid from genomic DNA.

Plasmid DNA was isolated from cultures of colonies having the desired insert and
was sequenced to confirm the lack of nucleotide errors from PCR. A construct with a
confirmed sequence was transformed into S. cerevisiae strain YPH500 using a Frozen-EZ
Yeast Transformation II™ Kit (Zymo Research, Orange, CA). Transformation reactions
were plated on SC-Trp media (see Stratagene pESC Vector Instruction Manual for media
recipes). Individual yeast colonies were screened for the presence of the OS19 nucleic
acid by colony PCR. Colonies were suspended in 20 µL of Y-Lysis Buffer (Zymo
Research) containing 5 units of zymolase and heated at 37°C for 10 minutes. Three µL
of this suspension was then used in a 25 µL PCR reaction using the PCR reaction mixture
and program described for the colony screen of the E. coli transformants. The pESC-Trp
vector was also transformed into YPH500 for use as a hydratase assay control, and
transformants were screened by PCR using GAL1 and GAL10 primers.

B. Construction of the pESC-Trp/OS19/EI hydratase vector

Plasmid DNA of a pESC-Trp/OS19 construct (Apa I-Sal I sites) with confirmed
sequence and positive assay results was used for insertion of the nucleic acid for the M.
elsdenii E1 activator polypeptide downstream of the GAL10 promoter. Three µg of
plasmid DNA was digested with the restriction enzyme Cla I, and the digest was purified
using a QIAquick PCR Purification Column. The vector DNA was then digested with the
restriction enzyme Not I, and the digest was inactivated by heating to 65°C for 20
minutes. The double-digested vector DNA was dephosphorylated with shrimp alkaline
phosphatase (Roche), separated on a 1% TAE-agarose gel, and gel purified.

The nucleic acid encoding the M. elsdenii E1 activator polypeptide was amplified
from genomic DNA using the PCR primer pair EINOTF and EICLAR. EINOTF was
designed to introduce a Not I restriction site and a translation initiation site at the
beginning of the amplified fragment. The EICLAR primer was designed to introduce a
Cla I restriction site at the end of the amplified fragment and to contain an altered
translational stop codon to allow in-frame translation of the FLAG epitope. The PCR mix
contained the following: 1X Expand PCR buffer, 100 ng M. elsdenii genomic DNA, 0.2
µM of each primer, 0.2 mM each dNTP, and 3.25 units of Expand DNA Polymerase in a
final volume of 100 µL. The PCR reaction was performed in an MJ Research PTC100
under the following conditions: an initial denaturation at 94°C for 1 minute; 8 cycles of
94°C for 30 seconds, 55°C for 45 seconds, and 72°C for 3 minutes; 24 cycles of 94°C for
30 seconds, 62°C for 45 seconds, and 72°C for 3 minutes; and a final extension for 7
minutes at 72°C. The amplification product was then separated by gel electrophoresis
using a 1% TAE-agarose gel, and a 0.8 Kb fragment was excised and purified. The
purified fragment was digested with Cla I restriction enzyme, purified with a QIAquick
PCR Purification Column, digested with Not I restriction enzyme, purified again with a
QIAquick PCR Purification Column, and quantified on a minigel.

60 ng of the digested PCR product containing the nucleic acid for the M. elsdenii
E1 activator polypeptide and 70 ng of the prepared pESC-Trp/OS19 hydratase vector
were ligated using T4 DNA ligase at 16°C for 16 hours. One µL of the ligation reaction
was used to electroporate 40 µL of E. coli Electromax™ DH10B™ cells. The
electroporated cells were plated onto LBC media. Individual colonies were screened
using colony PCR with the EINOTF and EICLAR primers. Individual colonies were
suspended in about 25 µL of 10 mM Tris, and 2 µL of the suspension was plated on LBC
media. The remnant suspension was heated for 10 minutes at 95°C to break open the
bacterial cells, and 2 µL of the heated cells was used in a 25 µL PCR reaction. The PCR mix
contained the following: 1X Taq buffer, 0.2 µM each primer, 0.2 mM each dNTP, and 1
unit of Taq DNA polymerase per reaction. The PCR program used was the same as
described above for amplification of the gene from genomic DNA. Plasmid DNA was
isolated from cultures of colonies having the desired insert and was sequenced to confirm
the lack of nucleotide errors from PCR.

C. Construction of the pESC-Leu/EIIα/EIIβ vector

Three µg of DNA of the vector pESC-Leu was digested with the restriction
enzyme Apa I, and the digest was purified using a QIAquick PCR Purification Column.
The vector DNA was then digested with the restriction enzyme Sal I, and the digest was
inactivated by heating to 65°C for 20 minutes. The double-digested vector DNA was
dephosphorylated with shrimp alkaline phosphatase (Roche), separated on a 1%
TAE-agarose gel, and gel purified.

The nucleic acid encoding the M. elsdenii E2α polypeptide was amplified from
genomic DNA using the PCR primer pair EIIαAPAF and EIIαSALR. EIIαAPAF was
designed to introduce an Apa I restriction site and a translation initiation site at the
beginning of the amplified fragment. The EIIαSALR primer was designed to introduce a
Sal I restriction site at the end of the amplified fragment and to contain an altered
translational stop codon to allow in-frame translation of the c-myc epitope. The PCR mix
contained the following: 1X Expand PCR buffer, 100 ng M. elsdenii genomic DNA, 0.2
µM of each primer, 0.2 mM each dNTP, and 5.25 units of Expand DNA Polymerase in a
final volume of 100 µL. The PCR reaction was performed in an MJ Research PTC100
under the following conditions: an initial denaturation at 94°C for 1 minute; 8 cycles of
94°C for 30 seconds, 55°C for 1 minute, and 72°C for 3 minutes; 24 cycles of 94°C for 30
seconds, 62°C for 1 minute, and 72°C for 3 minutes; and a final extension for 7 minutes
at 72°C. The amplification product was then separated by gel electrophoresis using a 1%
TAE-agarose gel, and a 1.3 Kb fragment was excised and purified. The purified fragment
was digested with Apa I restriction enzyme, purified with a QIAquick PCR Purification
Column, digested with Sal I restriction enzyme, purified again with a QIAquick PCR
Purification Column, and quantified on a minigel.

80 ng of the digested PCR product containing the nucleic acid encoding the M.
elsdenii E2α polypeptide and 80 ng of the prepared pESC-Leu vector were ligated using
T4 DNA ligase at 16°C for 16 hours. One µL of the ligation reaction was used to
electroporate 40 µL of E. coli Electromax™ DH10B™ cells. The electroporated cells
were plated onto LBC media. Individual colonies were screened using colony PCR with
the EIIαAPAF and EIIαSALR primers. Individual colonies were suspended in about 25
µL of 10 mM Tris, and 2 µL of the suspension was plated on LBC media. The remnant
suspension was heated for 10 minutes at 95°C to break open the bacterial cells, and 2 µL
of the heated cells was used in a 25 µL PCR reaction. The PCR mix contained the following:
1X Taq buffer, 0.2 µM each primer, 0.2 mM each dNTP, and 1 unit of Taq DNA
polymerase per reaction. The PCR program used was the same as described above for
amplification of the gene from genomic DNA. Plasmid DNA was isolated from cultures
of colonies having the desired insert and was sequenced to confirm the lack of nucleotide
errors from PCR.

Plasmid DNA of a pESC-Leu/EIIα vector with confirmed sequence was used for
insertion of the nucleic acid encoding the M. elsdenii E2β polypeptide. Three µg of
plasmid DNA was digested with the restriction enzyme Spe I, and the digest was purified
using a QIAquick PCR Purification Column. The vector DNA was then digested with the
restriction enzyme Not I and gel purified from a 1% TAE-agarose gel. The
double-digested vector DNA was then dephosphorylated with shrimp alkaline phosphatase
(Roche) and purified with a QIAquick PCR Purification Column.

The nucleic acid encoding the M. elsdenii E2β polypeptide was amplified from
genomic DNA using the PCR primer pair EIIβNOTF and EIIβSPER. The EIIβNOTF
primer was designed to introduce a Not I restriction site and a translation initiation site at
the beginning of the amplified fragment. The EIIβSPER primer was designed to
introduce an Spe I restriction site at the end of the amplified fragment and to contain an
altered translational stop codon to allow for in-frame translation of the FLAG epitope.
The PCR mix contained the following: 1X Expand PCR buffer, 100 ng M. elsdenii
genomic DNA, 0.2 µM of each primer, 0.2 mM each dNTP, and 5.25 units of Expand
DNA Polymerase in a final volume of 100 µL. The PCR reaction was performed in an
MJ Research PTC100 under the following conditions: an initial denaturation at 94°C for 1
minute; 8 cycles of 94°C for 30 seconds, 55°C for 45 seconds, and 72°C for 3 minutes; 24
cycles of 94°C for 30 seconds, 62°C for 45 seconds, and 72°C for 3 minutes; and a final
extension for 7 minutes at 72°C. The amplification product was separated by gel
electrophoresis using a 1% TAE-agarose gel, and a 1.1 Kb fragment was excised and
purified. The purified fragment was digested with Spe I restriction enzyme, purified with
a QIAquick PCR Purification Column, digested with Not I restriction enzyme, purified
again with a QIAquick PCR Purification Column, and quantified on a minigel.
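
The two-stage thermal cycling program described above can be written out explicitly to check the total number of cycles and the programmed run time. The following Python sketch is illustrative only: the class and function names are hypothetical, and ramp times between temperatures, which depend on the instrument, are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float   # block temperature in degrees Celsius
    seconds: int    # programmed hold time


@dataclass(frozen=True)
class Stage:
    cycles: int     # number of times the steps are repeated
    steps: tuple    # ordered Step objects within one cycle


# Program as described in the text for the 1.1 Kb fragment.
PROGRAM = (
    Stage(1, (Step(94, 60),)),                               # initial denaturation
    Stage(8, (Step(94, 30), Step(55, 45), Step(72, 180))),   # first 8 cycles
    Stage(24, (Step(94, 30), Step(62, 45), Step(72, 180))),  # 24 cycles, higher annealing
    Stage(1, (Step(72, 420),)),                              # final extension
)


def hold_time_seconds(program) -> int:
    """Sum of programmed hold times, ignoring ramping between temperatures."""
    return sum(
        stage.cycles * sum(step.seconds for step in stage.steps)
        for stage in program
    )


total = hold_time_seconds(PROGRAM)
print(f"{total} s ({total / 60:.0f} min) of programmed hold time")
```

Keeping the annealing temperature as a per-stage value makes it straightforward to express the other programs in this section, which differ mainly in annealing temperature and extension time.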

38 ng of the digested PCR product containing the nucleic acid encoding the M.
elsdenii E2β polypeptide and 50 ng of the prepared pESC-Leu/EIIα vector were ligated
using T4 DNA ligase at 16°C for 16 hours. One µL of the ligation reaction was used to
electroporate 40 µL of E. coli Electromax™ DH10B™ cells. The electroporated cells
were plated onto LBC plates. Individual colonies were screened using colony PCR with
the EIIβNOTF and EIIβSPER primers. Individual colonies were suspended in about 25
µL of 10 mM Tris, and 2 µL of the suspension was plated on LBC media. The remnant
suspension was heated for 10 minutes at 95°C to break open the bacterial cells, and 2 µL
of the heated cells was used in a 25 µL PCR reaction. The PCR mix contained the
following: 1X Taq buffer, 0.2 µM each primer, 0.2 mM each dNTP, and 1 unit of Taq
DNA polymerase per reaction. The PCR program used was the same as described above
for amplification of the gene from genomic DNA.

Plasmid DNA was isolated from cultures of colonies having the desired insert and
was sequenced to confirm the lack of nucleotide errors from PCR. A pESC-Leu/EIIα/EIIβ
construct with a confirmed sequence was co-transformed along with the
pESC-Trp/OS19/EI vector into S. cerevisiae strain YPH500 using a Frozen-EZ Yeast
Transformation II™ Kit (Zymo Research, Orange, CA). Transformation reactions were
plated on SC-Trp-Leu media. Individual yeast colonies were screened for the presence of
the OS19, E1, E2α, and E2β nucleic acid by colony PCR. Colonies were suspended in 20
µL of Y-Lysis Buffer (Zymo Research) containing [...] units of zymolase and heated at
37°C for 10 minutes. Three µL of this suspension was then used in a 25 µL PCR
reaction using the PCR reaction mixtures and programs described for the colony screens
of the E. coli transformants. The pESC-Trp/OS19 and pESC-Leu vectors were also
co-transformed into YPH500 for use as a lactyl-CoA dehydratase assay control. These
[...] (manual, pESC Yeast Epitope Tagging Vectors, Stratagene).

D. Construction of the pESC-His/D-LDH/PCT vector

Three µg of DNA of the vector pESC-His was digested with the restriction
enzyme Xho I, and the digest was purified using a QIAquick PCR Purification Column.
The vector DNA was then digested with the restriction enzyme Apa I and gel purified
from a 1% TAE-agarose gel. The double-digested vector DNA was dephosphorylated
with shrimp alkaline phosphatase (Roche) and purified using a QIAquick PCR
Purification Column.

The E. coli D-LDH gene was amplified from genomic DNA of strain DH10B
using the PCR primer pair LDHAPAF and LDHXHOR. LDHAPAF was designed to
introduce an Apa I restriction site and a translation initiation site at the beginning of the
amplified fragment. The LDHXHOR primer was designed to introduce an Xho I
restriction site at the end of the amplified fragment and to contain the translational stop
codon for the D-LDH gene. The PCR mix contained the following: 1X Expand PCR
buffer, 100 ng E. coli genomic DNA, 0.2 µM of each primer, 0.2 mM each dNTP, and
5.25 units of Expand DNA Polymerase in a final volume of 100 µL. The PCR reaction
was performed in an MJ Research PTC100 under the following conditions: an initial
denaturation at 94°C for 1 minute; 8 cycles of 94°C for 30 seconds, 59°C for 45 seconds,
and 72°C for 2 minutes; 24 cycles of 94°C for 30 seconds, 64°C for 45 seconds, and 72°C
for 2 minutes; and a final extension for 7 minutes at 72°C. The amplification product was
separated by gel electrophoresis using a 1% TAE-agarose gel, and a 1.0 Kb fragment was
excised and purified. The purified fragment was digested with Apa I restriction enzyme,
purified with a QIAquick PCR Purification Column, digested with Xho I restriction
enzyme, purified again with a QIAquick PCR Purification Column, and quantified on a
minigel.

80 ng of the digested PCR product containing the E. coli D-LDH gene and 80 ng
of the prepared pESC-His vector were ligated using T4 DNA ligase at 16°C for 16 hours.
One µL of the ligation reaction was used to electroporate 40 µL of E. coli Electromax™
DH10B™ cells. The electroporated cells were plated onto LBC media. Individual
colonies were screened using colony PCR with the LDHAPAF and LDHXHOR primers.
Individual colonies were suspended in about 25 µL of 10 mM Tris, and 2 µL of the
suspension was plated on LBC media. The remnant suspension was heated for 10
minutes at 95°C to break open the bacterial cells, and 2 µL of the heated cells was used in a 25
µL PCR reaction. The PCR mix contained the following: 1X Taq buffer, 0.2 µM each
primer, 0.2 mM each dNTP, and 1 unit of Taq DNA polymerase per reaction. The PCR
program used was the same as described above for amplification of the gene from
genomic DNA. Plasmid DNA was isolated from cultures of colonies having the desired
insert and was sequenced to confirm the lack of nucleotide errors from PCR.
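
As a rough check on the ligation described above, the insert-to-vector molar ratio can be estimated from the DNA masses and fragment lengths. The following Python sketch is illustrative only: the 80 ng masses and the 1.0 Kb insert length are taken from the text, whereas the vector length of 8,000 bp is an assumed placeholder that is not stated herein, and the conversion uses the common approximation of 650 g/mol per base pair.

```python
def pmol_of_dsdna(mass_ng: float, length_bp: int) -> float:
    """Convert a mass of double-stranded DNA to picomoles.

    Approximates the molar mass as 650 g/mol per base pair.
    """
    return mass_ng * 1000.0 / (length_bp * 650.0)


def insert_to_vector_ratio(insert_ng, insert_bp, vector_ng, vector_bp) -> float:
    """Molar ratio of insert to vector in a ligation reaction."""
    return pmol_of_dsdna(insert_ng, insert_bp) / pmol_of_dsdna(vector_ng, vector_bp)


# Hypothetical vector length, used only to make the arithmetic concrete.
ASSUMED_VECTOR_BP = 8000

ratio = insert_to_vector_ratio(
    insert_ng=80, insert_bp=1000,   # digested D-LDH fragment (from the text)
    vector_ng=80, vector_bp=ASSUMED_VECTOR_BP,
)
print(f"insert:vector molar ratio is about {ratio:.1f}:1")
```

Because equal masses are used, the ratio under these assumptions reduces to the ratio of the two lengths; a different vector length would change the result proportionally.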

Plasmid DNA of a pESC-His/D-LDH construct with a confirmed sequence was
used for insertion of the nucleic acid encoding the M. elsdenii PCT polypeptide. Three µg
of plasmid DNA was digested with the restriction enzyme Pac I, and the digest was
purified using a QIAquick PCR Purification Column. The vector DNA was then digested
with the restriction enzyme Spe I and gel purified from a 1% TAE-agarose gel. The
double-digested vector DNA was dephosphorylated with shrimp alkaline phosphatase
(Roche) and purified with a QIAquick PCR Purification Column.

The nucleic acid encoding the M. elsdenii PCT polypeptide was amplified from
genomic DNA using the PCR primer pair PCTSPEF and PCTPACR. PCTSPEF was
designed to introduce an Spe I restriction site and a translation initiation site at the
beginning of the amplified fragment. The PCTPACR primer was designed to introduce a
Pac I restriction site at the end of the amplified fragment and to contain the translational
stop codon for the PCT gene. The PCR mix contained the following: 1X Expand PCR
buffer, 100 ng M. elsdenii genomic DNA, 0.2 µM of each primer, 0.2 mM each dNTP,
and 5.25 units of Expand DNA Polymerase in a final volume of 100 µL. The PCR
reaction was performed in an MJ Research PTC100 under the following conditions: an
initial denaturation at 94°C for 1 minute; 8 cycles of 94°C for 30 seconds, 56°C for 45
seconds, and 72°C for 2.5 minutes; 24 cycles of 94°C for 30 seconds, 64°C for 45
seconds, and 72°C for 2.5 minutes; and a final extension for 7 minutes at 72°C. The
amplification product was separated by gel electrophoresis using a 1% TAE-agarose gel,
and a 1.55 Kb fragment was excised and purified. The purified fragment was digested
with Pac I restriction enzyme, purified with a QIAquick PCR Purification Column,
digested with Spe I restriction enzyme, purified again with a QIAquick PCR Purification
Column, and quantified on a minigel.
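
The final concentrations stated for the amplification reactions above can be converted into pipetting volumes with the relation C(stock) x V(stock) = C(final) x V(final). The following Python sketch is illustrative only: the final concentrations and the 100 µL reaction volume are from the text, whereas the stock concentrations are hypothetical values chosen to make the arithmetic concrete.

```python
def stock_volume_ul(final_conc: float, stock_conc: float, final_volume_ul: float) -> float:
    """Volume of stock such that C_stock * V_stock equals C_final * V_final.

    Both concentrations must be expressed in the same unit.
    """
    if stock_conc <= 0 or final_conc > stock_conc:
        raise ValueError("stock must be positive and at least as concentrated as the target")
    return final_conc * final_volume_ul / stock_conc


FINAL_VOLUME_UL = 100.0  # final reaction volume, from the text

# Stock concentrations below are assumptions, not values from the text.
primer_ul = stock_volume_ul(0.2, 10.0, FINAL_VOLUME_UL)   # 0.2 uM final, assumed 10 uM stock
dntp_ul = stock_volume_ul(0.2, 10.0, FINAL_VOLUME_UL)     # 0.2 mM final, assumed 10 mM stock
buffer_ul = stock_volume_ul(1.0, 10.0, FINAL_VOLUME_UL)   # 1X final, assumed 10X stock

print(f"each primer: {primer_ul:.1f} uL, dNTP mix: {dntp_ul:.1f} uL, buffer: {buffer_ul:.1f} uL")
```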

95 ng of the digested PCR product containing the nucleic acid encoding the M.
elsdenii PCT polypeptide and 75 ng of the prepared pESC-His/D-LDH vector were
ligated using T4 DNA ligase at 16°C for 16 hours. One µL of the ligation reaction was
used to electroporate 40 µL of E. coli Electromax™ DH10B™ cells. The electroporated
cells were plated onto LBC plates. Individual colonies were screened using colony PCR
with the PCTSPEF and PCTPACR primers. Individual colonies were suspended in about
25 µL of 10 mM Tris, and 2 µL of the suspension was plated on LBC media. The
remnant suspension was heated for 10 minutes at 95°C to break open the bacterial cells,
and 2 µL of the heated cells was used in a 25 µL PCR reaction. The PCR mix contained the
following: 1X Taq buffer, 0.2 µM each primer, 0.2 mM each dNTP, and 1 unit of Taq
DNA polymerase per reaction. The PCR program used was the same as described above
for amplification of the gene from genomic DNA.

Plasmid DNA was isolated from cultures of colonies having the desired insert and
was sequenced to confirm the lack of nucleotide errors from PCR. A construct with a
confirmed sequence was transformed into S. cerevisiae strain YPH500 using a Frozen-EZ
Yeast Transformation II™ Kit (Zymo Research, Orange, CA). Transformation reactions
were plated on SC-His media. Individual yeast colonies were screened for the presence
of the D-LDH and PCT genes by colony PCR. Colonies were suspended in 20 µL of
Y-Lysis Buffer (Zymo Research) containing [...] units of zymolase and heated at 37°C for 10
minutes. Three µL of this suspension was then used in a 25 µL PCR reaction using the
PCR reaction mixture and program described for the colony screen of the E. coli
transformants. The pESC-His vector was also transformed into YPH500 for use as an
assay control, and transformants were screened by PCR using GAL1 and GAL10 primers.

Example 13 - Expression of Enzymes in S. cerevisiae 

A. Hydratase Activity in Transformed Yeast

Individual colonies carrying the pESC-Trp/OS19 construct or the pESC-Trp
vector (negative control) were used to inoculate 5 mL cultures of SC-Trp media
containing 2% glucose. These cultures were grown for 16 hours at 30°C and used to
inoculate 35 mL of the same media. The subcultures were grown for 7 hours at 30°C,
and their OD600s were determined. A volume of cells giving an OD x volume equal to 40
was pelleted, washed with SC-Trp media with no carbon source, and repelleted. The cells
were suspended in 5 mL of SC-Trp media containing 2% galactose and used to inoculate
a total volume of 100 mL of this media. Cultures were grown for 17.5 hours at 30°C and
250 rpm. Cells were then pelleted, rinsed in 0.85% NaCl, and repelleted. Cell pellets (70
mg) were suspended in 140 µL of 50 mM Tris-HCl, pH 7.5, and an equal volume (pellet
plus buffer) of pre-rinsed glass beads (Sigma, 150-212 microns) was added. This mixture
was vortexed for 30 seconds and placed on ice for 1 minute, and the vortexing/cooling
cycle was repeated 8 additional times. The cells were then centrifuged for 6 minutes at
5,000g, and the supernatant was removed to a fresh tube. The beads/pellet were washed
twice with 250 µL of buffer and centrifuged, and the supernatants were joined with the first
supernatant.

An E. coli strain carrying the pETBlue-1/OS19 construct, described previously,
was used as a positive control for hydratase assays. A culture of this strain was grown to
saturation overnight and diluted 1:20 the following morning in fresh LBC media. The
culture was grown at 37°C and 250 rpm to an OD600 of 0.6, at which point it was induced
with IPTG at a final concentration of 1 mM. The culture was incubated for an additional
two hours at 37°C and 250 rpm. Cells were pelleted, washed with 0.85% NaCl, and
repelleted. Cells were disrupted using BugBuster™ Protein Extraction Reagent and
Benzonase® (Novagen) as per manufacturer's instructions with a 20 minute incubation at
room temperature. After centrifugation at 16,000g and 4°C, the supernatant was
transferred to a new tube and used in the activity assay.

Total protein content of cell extracts from S. cerevisiae described above was
quantified using a microplate Bio-Rad Protein Assay (Bio-Rad, Hercules, CA). The
OS19 constructs (both Apa I-Sal I and Apa I-Kpn I sites) in YPH500, the pESC-Trp
negative control in YPH500, and the pETBlue-1/OS19 construct in E. coli were tested for
their ability to convert acrylyl-CoA to 3-hydroxypropionyl-CoA. The assay was
conducted as previously described for the pETBlue-1/OS19 constructs in the E. coli
Tuner strain. When cell extract of the negative control strain was added to the reaction
mixture containing acrylyl-CoA, one dominant peak of MW 823 was exhibited. This
peak corresponds to acrylyl-CoA and indicates that acrylyl-CoA was not converted to any
other product. When cell extract of the strain carrying a pESC-Trp/OS19 construct
(either Apa I-Sal I or Apa I-Kpn I sites) was added to the reaction mix, the dominant peak
shifted to MW 841, which corresponds to 3-hydroxypropionyl-CoA. The reaction mix
from the E. coli control also showed the MW 841 peak. A time course study was
conducted for the pESC-Trp/OS19 (Apa I-Sal I) construct, which measured the
appearance of the MW 841 and MW 823 peaks after 0, 1, 3, 7, 15, 30, and 60 minutes of
reaction time. An increase in the 3-hydroxypropionyl-CoA peak was seen over time with
the cell extracts from both this construct and the E. coli control, whereas cell extract from
the YPH500 strain with vector only showed a dominant acrylyl-CoA peak.
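
The shift of the dominant peak from MW 823 to MW 841 described above is consistent with the addition of one molecule of water, as expected for a hydratase reaction. The following Python sketch is illustrative only and simply checks the peak values reported in the text against the nominal mass of water; the names used are hypothetical.

```python
WATER_NOMINAL_MASS = 18  # H2O, nominal (integer) mass

# Peak values as reported in the text.
OBSERVED_MW = {
    "acrylyl-CoA": 823,
    "3-hydroxypropionyl-CoA": 841,
}


def mass_shift(substrate: str, product: str, table=OBSERVED_MW) -> int:
    """Difference between the product and substrate peak values."""
    return table[product] - table[substrate]


def is_hydration(substrate_mw: int, product_mw: int) -> bool:
    """True when the shift equals the nominal mass of one water molecule."""
    return product_mw - substrate_mw == WATER_NOMINAL_MASS


shift = mass_shift("acrylyl-CoA", "3-hydroxypropionyl-CoA")
print(f"observed shift: +{shift}; consistent with hydration: {is_hydration(823, 841)}")
```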

Propionyl CoA-Transferase Activity in Transformed Yeast
Individual colonies of S. cerevisiae strain YPH500 carrying the pESC-His/D-LDH
or pESC-His/D-LDH/PCT construct or the pESC-His vector with no insert (negative
control) were used to inoculate 5 mL cultures of SC-His media containing 2% glucose.
These cultures were grown for 16 hours at 30°C and 250 rpm and were then used to
inoculate 35 mL of the same media. The subcultures were grown for 7 hours at 30°C,
and their OD600s were determined. For each strain, a volume of cells giving an OD x
volume equal to 40 was pelleted, washed with SC-His media with no carbon source, and
repelleted. The cells were suspended in 5 mL of SC-His media containing 2% galactose
and used to inoculate a total volume of 100 mL of this media. Cultures were grown for
16.75 hours at 30°C and 250 rpm. Cells were then pelleted, rinsed in 0.85% NaCl, and
repelleted. Cell pellets (70 mg) were suspended in 140 µL of 100 mM potassium
phosphate buffer, pH 7.5, and an equal volume (pellet plus buffer) of pre-rinsed glass
beads (Sigma, 150-212 microns) was added. This mixture was vortexed for 30 seconds
and placed on ice for 1 minute, and the vortexing/cooling cycle was repeated 8 additional
times. The cells were then centrifuged for 6 minutes at 5,000g, and the supernatant was
removed to a fresh tube. The beads/pellet were washed twice with 250 µL of buffer and
centrifuged, and the supernatants were joined with the first supernatant.
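
The inoculum normalization described above, in which a volume of cells giving an OD x volume equal to 40 is pelleted for each strain, can be expressed as a short calculation. The following Python sketch is illustrative only; the OD600 readings are hypothetical, as measured values are not given in the text.

```python
def harvest_volume_ml(od600: float, target_od_units: float = 40.0) -> float:
    """Culture volume (mL) for which OD600 x volume equals the target."""
    if od600 <= 0:
        raise ValueError("OD600 must be positive")
    return target_od_units / od600


def starting_od(od_units: float, culture_volume_ml: float) -> float:
    """Nominal OD600 after the pelleted cells are resuspended in the induction culture."""
    return od_units / culture_volume_ml


# Hypothetical subculture readings, for illustration only.
for od in (1.6, 2.0, 2.5):
    print(f"OD600 {od:.1f}: pellet {harvest_volume_ml(od):.1f} mL")

# 40 OD units in a total volume of 100 mL, as described in the text.
print(f"nominal starting OD600: {starting_od(40.0, 100.0):.2f}")
```

Normalizing in this manner gives each strain the same nominal cell density at the start of galactose induction, so that differences in measured activity are less likely to reflect differences in inoculum size.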

An E. coli strain carrying the pETBlue-1/PCT construct, described previously,
was used as a positive control for propionyl CoA transferase assays. A culture of this
strain was grown to saturation overnight and diluted 1:20 the following morning in fresh
LB media containing 100 µg/mL of carbenicillin. The culture was grown at 37°C and
250 rpm to an OD600 of 0.6, at which point it was induced with IPTG at a final
concentration of 1 mM. The culture was incubated for an additional two hours at 37°C
and 250 rpm. Cells were pelleted, washed with 0.85% NaCl, and repelleted. Cells were
disrupted using BugBuster™ Protein Extraction Reagent and Benzonase® (Novagen) as
per manufacturer's instructions with a 20 minute incubation at room temperature. After
centrifugation at 16,000g and 4°C, the supernatant was transferred to a new tube and used
in the activity assay.

Total protein content of cell extracts was quantified using a microplate Bio-Rad
Protein Assay (Bio-Rad, Hercules, CA). The D-LDH and D-LDH/PCT constructs in S.
cerevisiae strain YPH500, the pESC-His negative control in YPH500, and the
pETBlue-1/PCT construct in E. coli were tested for their ability to catalyze the conversion of
propionyl-CoA and acetate to acetyl-CoA and propionate. The assay mixture used was
that previously described for the pETBlue-1/PCT constructs in the E. coli Tuner strain.
When 1 µg of total cell extract protein of the negative control strain or the
YPH500/pESC-His/D-LDH strain was added to the reaction mixture, no increase in
absorbance (0.00 to 0.00) was seen over 11 minutes. Increases in absorbance from 0.00
to 0.04 and from 0.00 to 0.06 were seen, respectively, with 1 µg of cell extract protein
from the YPH500/pESC-His/D-LDH/PCT strain and the E. coli/PCT strain. With 2 µg
of total cell extract protein, the negative control strain and the YPH500/pESC-His/D-LDH
strain showed an increase in absorbance from 0.00 to 0.01 over 11 minutes, whereas
increases from 0.00 to 0.10 and 0.00 to 0.08 were seen, respectively, with the
YPH500/pESC-His/D-LDH/PCT strain and the E. coli/PCT strain.

Lactyl-CoA Dehydratase Activity in Transformed Yeast

Individual colonies of S. cerevisiae strain YPH500 carrying the pESC-His/D-LDH
or pESC-His/D-LDH/PCT construct or the pESC-His vector with no insert (negative
control) were used to inoculate 5 mL cultures of SC-His media containing [...].
These cultures were grown for [...] hours at 30°C and used to inoculate 35 mL of SC-His
media containing 2% raffinose. The subcultures were grown for 8 hours at 30°C, and
their OD600s were determined. For each strain, a volume of cells giving an OD x volume
equal to 40 was pelleted, resuspended in 10 mL of SC-His containing 2%
galactose, and used to inoculate a total volume of 100 mL of this media. Cultures were
grown for 17 hours at 30°C and 250 rpm. Cells were then pelleted, rinsed in 0.85% NaCl,
and repelleted. Cell pellets (190 mg) were suspended in 380 µL of 100 mM potassium
phosphate buffer, pH 7.5, and an equal volume (pellet plus buffer) of pre-rinsed glass
beads (Sigma, 150-212 microns) was added. This mixture was vortexed for 30 seconds
and placed on ice for 1 minute, and the vortexing/cooling cycle was repeated 7 additional
times. The cells were then centrifuged for 6 minutes at 5,000g, and the supernatant was
removed to a fresh tube. The beads/pellet were washed twice with 300 µL of buffer and
centrifuged, and the supernatants were joined with the first supernatant.

An anaerobically-grown culture of E. coli strain DH10B was used as a positive
control for D-LDH assays. A culture of this strain was grown to saturation overnight and
diluted 1:20 the following morning in fresh LB media. The culture was grown
anaerobically at 37°C for 7.5 hours. Cells were pelleted, washed with 0.85% NaCl, and
repelleted. Cells were disrupted using BugBuster™ Protein Extraction Reagent and
Benzonase® (Novagen) as per manufacturer's instructions with a 20-minute incubation at
room temperature. After centrifugation at 16,000g and 4°C, the supernatant was
transferred to a new tube and used in the activity assay.

Total protein content of cell extracts was quantified using a microplate Bio-Rad
Protein Assay (Bio-Rad, Hercules, CA). The D-LDH and D-LDH/PCT constructs in
YPH500, the pESC-His negative control in YPH500, and the anaerobically-grown E. coli
strain were tested for their ability to catalyze the conversion of pyruvate to lactate by
assaying the concurrent oxidation of NADH to NAD. The assay mixture contained 100
mM potassium phosphate buffer, pH 7.5, 0.2 mM NADH, and 0.5-1.0 µg of cell extract.
The reaction was started by the addition of sodium pyruvate to a final concentration of 5
mM, and the decrease in absorbance at 340 nm was measured over 10 minutes. When 0.5
µg of total cell extract protein of the negative control strain was added to the reaction
mixture, a decrease in absorbance from -0.01 to -0.02 was seen over 200 seconds. A
decrease in absorbance from -0.21 to -0.47 and -0.20 to -0.47 over 200 seconds was
seen, respectively, for cell extract from the YPH500/pESC-His/D-LDH or
YPH500/pESC-His/D-LDH/PCT strains. 0.5 µL (7.85 µg) of cell extract from the
anaerobically-grown E. coli strain showed a decrease in absorbance very similar to that
for 1 µg of cell extract of the YPH500/pESC-His/D-LDH/PCT strain. When 4 µg of cell
extract was used, the YPH500/pESC-His/D-LDH/PCT strain showed a decrease in
absorbance from -0.18 to -0.60 over 10 minutes, whereas the negative control strain
showed no decrease in absorbance (-0.08 to -0.08).
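
The absorbance changes reported above can be converted into an approximate specific activity using the Beer-Lambert law and the widely used extinction coefficient of NADH at 340 nm (6.22 per mM per cm). The following Python sketch is illustrative only: the path length of 1 cm and the reaction volume of 1 mL are assumptions, since neither is stated in the text and a microplate format would give a different path length, and the same 0.5 µg of extract protein is assumed for the D-LDH strain as for the negative control.

```python
NADH_EXT_COEFF = 6.22  # mM^-1 cm^-1 at 340 nm


def specific_activity(delta_a: float, seconds: float, protein_ug: float,
                      path_cm: float = 1.0, volume_ml: float = 1.0) -> float:
    """Micromoles of NADH oxidized per minute per mg of protein.

    path_cm and volume_ml are ASSUMED defaults, not values from the text.
    """
    delta_a_per_min = abs(delta_a) / seconds * 60.0
    mm_per_min = delta_a_per_min / (NADH_EXT_COEFF * path_cm)  # concentration change
    umol_per_min = mm_per_min * volume_ml                      # 1 mM x 1 mL = 1 umol
    return umol_per_min / (protein_ug / 1000.0)


# Readings from the text: change in A340 over 200 seconds.
d_ldh = specific_activity(-0.47 - (-0.21), 200, 0.5)
negative = specific_activity(-0.02 - (-0.01), 200, 0.5)
print(f"D-LDH strain: {d_ldh:.1f} U/mg; negative control: {negative:.1f} U/mg")
```

Under these assumptions the relative difference between the D-LDH strain and the negative control does not depend on the assumed path length or volume, since both enter the two calculations identically.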

D. Demonstration of 3-HP production in S. cerevisiae

The pESC-Trp/OS19/EI, pESC-Leu/EIIα/EIIβ, and pESC-His/D-LDH/PCT
constructs were transformed into a single strain of S. cerevisiae YPH500 using a
Frozen-EZ Yeast Transformation II™ Kit (Zymo Research, Orange, CA). A negative control
strain was also developed by transformation of the pESC-Trp, pESC-Leu, and pESC-His
vectors into a single YPH500 strain. Transformation reactions were plated on
SC-Trp-Leu-His media. Individual yeast colonies were screened by colony PCR for the presence
or absence of nucleic acid corresponding to each construct.

The strain carrying all six genes and the negative control strain were grown in 5
mL of SC-Trp-Leu-His media containing 2% glucose. These cultures were grown for 31
hours at 30°C, and 2 mL was used to inoculate 50 mL of the same media. The
subcultures were grown for 19 hours at 30°C, and their OD600s were determined. For
each strain, a volume of cells giving an OD x volume equal to 100 was pelleted, washed
with SC-Trp-Leu-His media with no carbon source, and repelleted. The cells were
suspended in 10 mL of SC-Trp-Leu-His media containing 2% galactose and 2% raffinose
and used to inoculate a total volume of 250 mL of this media. The cultures were grown
in bottles at 30°C with no shaking, and samples were taken at 0, 4.5, 20, 28.5, 45, and 70
hours. Samples were spun down to remove cells, and the supernatant was filtered using
0.45 micron Acrodisc Syringe filters (Pall Gelman Laboratory, Ann Arbor, MI).

100 microliters of the filtered broth was used to derive CoA esters of any lactate
or 3-HP in the broth using 6 micrograms of purified propionyl-CoA transferase, 50 mM
potassium phosphate buffer (pH [...]), and 1 mM acetyl-CoA. The reaction was allowed
to proceed at room temperature [...] 10%
trifluoroacetic acid. The reaction mixtures were purified using Sep Pak C18 columns as
previously described and analyzed by LC/MS.

Example 14 - Constructing a Biosynthetic Pathway that
Produces Organic Acids from β-alanine
One possible pathway to 3-HP from β-alanine involves the use of a polypeptide
having CoA transferase activity (e.g., an enzyme from a class of enzymes that transfers a
CoA group from one metabolite to the other). As shown in Figure 54, β-alanine can be
converted to β-alanyl-CoA using a polypeptide having CoA transferase activity and CoA
donors such as acetyl-CoA or propionyl-CoA. Alternatively, β-alanyl-CoA can be
generated by the action of a polypeptide having CoA synthetase activity. The
β-alanyl-CoA can be deaminated to form acrylyl-CoA by a polypeptide having β-alanyl-CoA
ammonia lyase activity. The hydration of acrylyl-CoA at the β position to yield
3-HP-CoA can be carried out by a polypeptide having 3-HP-CoA dehydratase activity. The
3-HP-CoA can act as a CoA donor for β-alanine, a reaction that can be catalyzed by a
polypeptide having CoA transferase activity, thus yielding 3-HP as a product.
Alternatively, 3-HP-CoA can be hydrolyzed to yield 3-HP by a polypeptide having
specific CoA hydrolase activity.
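
The pathway described above can be summarized as an ordered list of conversions, each associated with the activity or alternative activities named in the text. The following Python sketch is illustrative only; the data structure and the function name are hypothetical and are offered as one convenient representation, not as part of the described methods.

```python
# Each entry: (substrate, product, activity or alternative activities from the text).
PATHWAY = [
    ("beta-alanine", "beta-alanyl-CoA", "CoA transferase or CoA synthetase"),
    ("beta-alanyl-CoA", "acrylyl-CoA", "beta-alanyl-CoA ammonia lyase"),
    ("acrylyl-CoA", "3-HP-CoA", "3-HP-CoA dehydratase"),
    ("3-HP-CoA", "3-HP", "CoA transferase or CoA hydrolase"),
]


def trace(pathway, start: str, end: str) -> list:
    """Follow the conversions from start to end and return the metabolites visited."""
    next_metabolite = {substrate: product for substrate, product, _ in pathway}
    route = [start]
    while route[-1] != end:
        # Stop if there is no onward step or the walk is longer than the pathway.
        if route[-1] not in next_metabolite or len(route) > len(pathway):
            raise ValueError(f"no route from {start} to {end}")
        route.append(next_metabolite[route[-1]])
    return route


print(" -> ".join(trace(PATHWAY, "beta-alanine", "3-HP")))
for substrate, product, activity in PATHWAY:
    print(f"{substrate} to {product}: {activity}")
```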

Methods for isolating, sequencing, expressing, and testing the activity of a
polypeptide having CoA transferase activity are described herein.

A. Isolation of a polypeptide having β-alanyl-CoA Ammonia Lyase Activity
Polypeptides having β-alanyl-CoA ammonia lyase activity can catalyze the
conversion of β-alanyl-CoA into acrylyl-CoA. The activity of such polypeptides has been
described by Vagelos et al. (J. Biol. Chem., 234:490-497 (1959)) in Clostridium
propionicum. This polypeptide can be used as part of the acrylate pathway in Clostridium
propionicum to produce propionic acid.

C. propionicum was grown at 37°C in an anoxic medium containing 0.2% yeast
extract, 0.2% trypticase peptone, 0.05% cysteine, 0.5% β-alanine, 0.4% VRB-salts, 5 mM
potassium phosphate, pH 7.0. The cells were harvested after 12 hours and washed twice
with 50 mM potassium phosphate (Kpi), pH 7.0. About 2 g of wet packed cells were
resuspended in 40 mL of Kpi, pH 7.0, 1 mM MgCl2, 1 mM EDTA, and 1 mM DTT (Buffer
A), and homogenized by sonication at about 85-100 W power using a 3 mm tip (Branson
sonifier 250). Cell debris was removed by centrifugation at 100,000g for 45 minutes in a
Centricon T-1080 Ultra centrifuge, and the cell free extract (~110 U/mg activity) was
subjected to anion exchange chromatography on Source-15Q material. The Source-15Q
column was loaded with 32 mL of cell free extract. The column was developed by a
linear gradient of 0 M to 0.5 M NaCl within 10 column volumes. The polypeptide eluted
between 70-110 mM NaCl.
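
For a linear gradient, the salt concentration at which a polypeptide elutes can be related to its position in the gradient. The following Python sketch is illustrative only: it reads the elution window as 70 to 110 mM NaCl and the gradient as 0 M to 0.5 M NaCl over 10 column volumes, as described above, and it ignores any delay volume of the chromatography system, which is not given in the text.

```python
def column_volumes_at(conc_m: float, start_m: float = 0.0, end_m: float = 0.5,
                      gradient_cv: float = 10.0) -> float:
    """Position in a linear gradient (in column volumes) at a given NaCl concentration."""
    if not (min(start_m, end_m) <= conc_m <= max(start_m, end_m)):
        raise ValueError("concentration lies outside the gradient")
    return (conc_m - start_m) / (end_m - start_m) * gradient_cv


elution_start = column_volumes_at(0.070)  # 70 mM NaCl
elution_end = column_volumes_at(0.110)    # 110 mM NaCl
print(f"elution between {elution_start:.1f} and {elution_end:.1f} column volumes")
```

Under these assumptions the polypeptide elutes early in the gradient, which is consistent with relatively weak binding to the anion exchange material.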

The solution was adjusted to a final concentration of 1 M (NH4)2SO4 and applied
onto a Resource-Phe column equilibrated with 1 M (NH4)2SO4 in buffer A. The
polypeptide did not bind to this column.






The final preparation was obtained after concentration in an Amicon chamber
(filter cut-off 30 kDa). The functional polypeptide is composed of four polypeptide
subunits, each having a molecular mass of 16 kDa. The polypeptide had a final specific
activity of 1033 U/mg in the standard assay (see below).
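
The specific activities reported for the cell free extract and for the final preparation allow the overall purification factor to be estimated, and the subunit composition gives the approximate native mass. The following Python sketch is illustrative only: it reads the activity of the cell free extract as about 110 U/mg, and the function names are hypothetical.

```python
def fold_purification(initial_u_per_mg: float, final_u_per_mg: float) -> float:
    """Ratio of final to initial specific activity."""
    if initial_u_per_mg <= 0:
        raise ValueError("initial specific activity must be positive")
    return final_u_per_mg / initial_u_per_mg


def native_mass_kda(subunits: int, subunit_mass_kda: float) -> float:
    """Approximate mass of a complex of identical subunits."""
    return subunits * subunit_mass_kda


fold = fold_purification(110, 1033)   # cell free extract vs. final preparation
mass = native_mass_kda(4, 16)         # four subunits of 16 kDa each
print(f"purification: about {fold:.1f}-fold; native mass: about {mass:.0f} kDa")
print(f"retained by a 30 kDa cut-off filter: {mass > 30}")
```

A native mass above the 30 kDa cut-off is consistent with the polypeptide being retained during concentration, although retention also depends on molecular shape.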

The polypeptide sample after every purification step was separated on a 15%
SDS-PAGE gel. The gel was stained with 0.1% Coomassie R 250, and the destaining
was achieved by using 7.1% acetic acid/5% ethanol solution.

The polypeptide was desalted by RP-HPLC and subjected to N-terminal
sequencing by gas phase Edman degradation. The results of this analysis yielded a 35
amino acid N-terminal sequence of the polypeptide. The sequence was as follows:
MVGKKWHHLMMSAKDAHYTGNLVNGARTVNQWGD (SEQ ID NO:177).

B. Amplification of a Gene Fragment 

The 35 amino acid sequence of the polypeptide having β-alanyl-CoA ammonia
lyase activity was used to design primers with which to amplify the corresponding DNA
from the genome of C. propionicum. Genomic DNA from C. propionicum was isolated
using the Gentra Genomic DNA Isolation Kit (Gentra Systems, Minneapolis) following
the genomic DNA protocol for gram-positive bacteria. A codon usage table for
Clostridium propionicum was used to back translate the seven amino acids on either end
of the amino acid sequence to obtain 20-nucleotide degenerate primers:

ACLF: 5'-ATGGTWGGYAARAARGTWGT-3' (SEQ ID NO:178)
ACLR: 5'-TCRCCCCAYTGRTTWACRAT-3' (SEQ ID NO:179)
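
The degeneracy of each primer, that is, the number of distinct sequences represented by the IUPAC ambiguity codes, can be computed directly from the primer sequences. The following Python sketch is illustrative only; the function name is hypothetical, and the ambiguity table follows the standard IUPAC nucleotide code.

```python
from math import prod

# Standard IUPAC nucleotide codes and the bases each one represents.
IUPAC_CHOICES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "W": "AT", "S": "CG", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

PRIMERS = {
    "ACLF": "ATGGTWGGYAARAARGTWGT",
    "ACLR": "TCRCCCCAYTGRTTWACRAT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences encoded by a degenerate primer."""
    return prod(len(IUPAC_CHOICES[base]) for base in primer.upper())


for name, sequence in PRIMERS.items():
    print(f"{name}: {len(sequence)} nt, {degeneracy(sequence)}-fold degenerate")
```

Because each individual sequence is present at only a fraction of the total primer concentration, a higher degeneracy is a common reason for raising the amount of primer in the reaction.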
The primers were used in a 50 µL PCR reaction containing 1X Taq PCR buffer,
0.6 µM each primer, 0.2 mM each dNTP, 2 units of Taq DNA polymerase (Roche
Molecular Biochemicals, Indianapolis, IN), 2.5% (v/v) DMSO, and 100 ng of genomic
DNA. PCR was conducted using a touchdown PCR program with 4 cycles at an
annealing temperature of 58°C, 4 cycles at 56°C, 4 cycles at 54°C, and 24 cycles at 52°C.
Each cycle used an initial 30 second denaturing step at 94°C and a 1.25 minute extension
at 72°C, and the program had an initial denaturation step at 94°C for 2 minutes and a final
extension at 72°C for 5 minutes. The amounts of PCR primer used in the reaction were
increased three-fold above typical PCR amounts due to the amount of degeneracy in the
3' end of the primer. In addition, separate PCR reactions containing each individual
primer were run to identify PCR products resulting from single degenerate primers.
Twenty µL of each PCR product was separated on a 2.0% TAE (Tris-acetate-EDTA)-agarose
gel.
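
The touchdown program described above can be written out as data to make the cycle count explicit. This is a minimal sketch: the stage list mirrors the annealing temperatures and cycle numbers given in the text, the extension is taken as 1.25 minutes, and the names `TOUCHDOWN_STAGES` and `annealing_schedule` are illustrative assumptions. The annealing time is not stated in the text, so it is left out of the timing estimate.

```python
# (annealing temperature in degrees C, number of cycles), in run order.
TOUCHDOWN_STAGES = [(58, 4), (56, 4), (54, 4), (52, 24)]


def annealing_schedule(stages):
    """Expand the stages into one annealing temperature per cycle."""
    return [temp for temp, cycles in stages for _ in range(cycles)]


schedule = annealing_schedule(TOUCHDOWN_STAGES)
total_cycles = len(schedule)

# Fixed steps of every cycle: 30 s denaturation at 94 C + 1.25 min extension at 72 C.
fixed_seconds_per_cycle = 30 + 1.25 * 60
fixed_minutes_total = total_cycles * fixed_seconds_per_cycle / 60

print(total_cycles, schedule[0], schedule[-1], fixed_minutes_total)
```

The figure for fixed minutes is a lower bound on cycling time, since the annealing steps and ramping between temperatures are not included.
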

A band of about 100 bp was produced by the reaction containing both the forward
and reverse primers, but was not present in the individual forward and reverse primer
control reactions. This fragment was excised and purified using a QIAquick Gel
Extraction Kit (Qiagen, Valencia, CA). Four microliters of the purified band was ligated
into pCRII-TOPO vector and transformed by a heat-shock method into TOP10 E. coli
cells using a TOPO cloning procedure (Invitrogen, Carlsbad, CA). Transformations
were plated on LB media containing 50 µg/mL of kanamycin and 50 µg/mL of 5-bromo-
4-chloro-3-indolyl-β-D-galactopyranoside (X-gal). Individual white colonies were
resuspended in 25 µL of 10 mM Tris and heated for 10 minutes at 95°C to break open the
bacterial cells. Two microliters of the heated cells were used in a 25 µL PCR reaction
using M13R and M13F universal primers homologous to the pCRII-TOPO vector. The
PCR mix contained the following: 1X Taq PCR buffer, 0.2 µM each primer, 0.2 mM each
dNTP, and 1 unit of Taq DNA polymerase per reaction. The PCR reaction was
performed in an MJ Research PTC100 under the following conditions: an initial
denaturation at 94°C for 2 minutes; 30 cycles of 94°C for 30 seconds, 52°C for 1 minute,
and 72°C for 1.25 minutes; and a final extension for 7 minutes at 72°C. Plasmid DNA
was obtained (QIAprep Spin Miniprep Kit, Qiagen) from cultures of colonies showing the
desired insert and was used for DNA sequencing with M13R universal primer. The
following nucleic acid sequence was internal to the degenerate primers and corresponds
to a portion of the 35 amino acid residue sequence:
5'-ACATCATTTAATGATGAGCGCAAAAGATGCTCACTATACTGGAAACTTAGTAAACGGGGCTAGA-3'
(SEQ ID NO:180).
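
One way to confirm that a cloned fragment matches peptide data is to translate it and compare the result with the sequenced residues. The sketch below is illustrative only. It assumes the standard genetic code, and the reading-frame offset of 1 is an assumption: the first base of the fragment is taken to complete the codon that begins in the forward primer. `PEPTIDE_CORE` is the stretch of the N-terminal sequence that lies between the two primer-encoded ends.

```python
# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def translate(dna, offset=0):
    """Translate complete codons starting at `offset`; trailing bases are ignored."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]]
                   for i in range(offset, len(dna) - 2, 3))


# Internal fragment (SEQ ID NO:180).
FRAGMENT = "ACATCATTTAATGATGAGCGCAAAAGATGCTCACTATACTGGAAACTTAGTAAACGGGGCTAGA"

# Residues of the N-terminal sequence between the primer-encoded ends.
PEPTIDE_CORE = "HHLMMSAKDAHYTGNLVNGAR"

translated = translate(FRAGMENT, offset=1)
print(translated, translated == PEPTIDE_CORE)
```
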

C. Genome Walking to Obtain the Complete Coding Sequence 

Primers for conducting genome walking in both upstream and downstream
directions were designed using the portion of the nucleic acid sequence that was internal
to the degenerate primers. The primer sequences were as follows:

ACLGSP1F: 5'-GTACATCATTTAATGATGAGCGCAAAAGATG-3' (SEQ ID NO:181)
ACLGSP2F: 5'-GATGCTCACTATACTGGAAACTTAGTAAAC-3' (SEQ ID NO:182)
ACLGSP1R: 5'-ATTCTAGCGCCGTTTACTAAGTTTCCAG-3' (SEQ ID NO:183)
ACLGSP2R: 5'-CCAGTATAGTGAGCATCTTTTGGGCTCATC-3' (SEQ ID NO:184)

GSP1F and GSP2F are primers facing downstream, GSP1R and GSP2R are
primers facing upstream, and GSP2F and GSP2R are primers nested inside GSP1F and
GSP1R, respectively. Genome walking libraries were constructed according to the
manual for CLONTECH's Universal Genome Walking Kit (CLONTECH Laboratories,
Palo Alto, CA), with the exception that the restriction enzymes Ssp I and Hinc II were
used in addition to Dra I, EcoR V, and Pvu II. PCR was conducted in a Perkin Elmer
9700 Thermocycler using the following reaction mix: 1X XL Buffer II, 0.2 mM each
dNTP, 1.25 mM Mg(OAc)2, 0.2 µM each primer, 2 units of rTth DNA polymerase XL
(Applied Biosystems, Foster City, CA), and 1 µL of library per 50 µL reaction. First
round PCR used an initial denaturation at 94°C for 5 seconds; 7 cycles consisting of 2 sec
at 94°C and 3 min at 70°C; 32 cycles consisting of 2 sec at 94°C and 3 min at 64°C; and a
final extension at 64°C for 4 min. Second round PCR used an initial denaturation at 94°C
for 15 seconds; 5 cycles consisting of 5 sec at 94°C and 3 min at 70°C; 26 cycles
consisting of 5 sec at 94°C and 3 min at 64°C; and a final extension at 66°C for 7 min.
Twenty µL of each first and second round product was run on a 1.0% TAE-agarose gel.
In the second round PCR for the forward reactions, a 1.4 Kb band was obtained for Dra I,
a 1.5 Kb band for Hinc II, a 4.0 Kb band for Pvu II, and 2.0 and 2.6 Kb bands were
obtained for Ssp I. In the second round PCR for the reverse reactions, a 1.5 Kb band was
obtained for Dra I, a 0.8 Kb band for EcoR V, a 2.0 Kb band for Hinc II, a 2.9 Kb band
for Pvu II, and a 1.5 Kb band was obtained for Ssp I. Several of these fragments were gel
purified, cloned, and sequenced.

The coding sequence of the polypeptide having β-alanyl-CoA ammonia lyase
activity is set forth in SEQ ID NO:162. This coding sequence encodes the amino acid
sequence set forth in SEQ ID NO:160. The coding sequence was cloned and expressed in
bacterial cells. A polypeptide with the expected size was isolated and tested for
enzymatic activity.

The isolation of a nucleic acid molecule encoding a polypeptide having 3-HP-
CoA dehydratase activity (e.g., the seventh enzymatic activity in Figure 54, which can be
accomplished with a polypeptide having the amino acid sequence set forth in SEQ ID
NO:41) is described herein. This polypeptide in combination with a polypeptide having
CoA transferase activity (e.g., a polypeptide having the amino acid sequence set forth in
SEQ ID NO:2) and a polypeptide having β-alanyl-CoA ammonia lyase activity (e.g., a
polypeptide having the amino acid sequence set forth in SEQ ID NO:160) provides one
method of making 3-HP from β-alanine.

Example 15 - Constructing a Biosynthetic Pathway that
Produces Organic Acids from β-alanine

In another pathway, β-alanine generated from aspartate can be deaminated by a
polypeptide having 4-aminobutyrate aminotransferase activity (Figure 55). This
reaction also can regenerate glutamate that is consumed in the formation of aspartate.
The deamination of β-alanine can yield malonate semialdehyde, which can be further
reduced to 3-HP by a polypeptide having 3-hydroxypropionate dehydrogenase activity or
a polypeptide having 3-hydroxyisobutyrate dehydrogenase activity. Such polypeptides
can be obtained as follows.

A. Cloning gabT (4-aminobutyrate aminotransferase) from C. acetobutylicum

The following PCR primers were designed based on a published sequence for a
gabT gene from Clostridium acetobutylicum (GenBank# AE007654):

Cac aba nco sen: 5'-GAGCCATGGAAGAAATAAATGCTAAAG-3' (SEQ ID NO:185)
Cac aba bam anti: 5'-AGAGGATGGCTTTTTAAATCGCTATTC-3' (SEQ ID NO:186)

The primers introduced an Nco I site at the 5' end and a BamH I site at the 3' end. A
PCR reaction was set up using chromosomal DNA from C. acetobutylicum as the
template.

Reaction mix                                       PCR Program
H2O                               80.75 µL         94°C  5 minutes
Taq Plus Long 10x Buffer          10 µL            25 cycles of:
dNTP mix (10 mM)                  3 µL               94°C  30 seconds
Cac aba nco sen (20 µM)           2 µL               50°C  30 seconds
Cac aba bam anti (20 µM)          2 µL               72°C  80 seconds + 2 seconds/cycle
C. acetobutylicum DNA (~100 ng)   1 µL             1 cycle of:
Taq Plus Long (5 U/µL)            1 µL               68°C  7 minutes
Pfu (2.5 U/µL)                    0.25 µL          4°C until use
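
The reaction mix listed above can be checked with a few lines of arithmetic. This sketch is an illustration under stated assumptions: the primer stocks are taken to be 20 µM, the dNTP and buffer stocks are used as labeled (10 mM and 10x), and the dictionary keys are illustrative labels rather than names from the protocol. It sums the component volumes and computes final concentrations by simple dilution.

```python
# Component volumes in microliters, as listed for the gabT amplification.
VOLUMES_UL = {
    "water": 80.75,
    "10x buffer": 10.0,
    "dNTP mix": 3.0,
    "sense primer": 2.0,
    "antisense primer": 2.0,
    "template DNA": 1.0,
    "Taq Plus Long": 1.0,
    "Pfu": 0.25,
}


def final_concentration(stock, volume_ul, total_ul):
    """Dilution: C_final = C_stock * V_added / V_total (same units as stock)."""
    return stock * volume_ul / total_ul


total_ul = sum(VOLUMES_UL.values())
dntp_mM = final_concentration(10.0, VOLUMES_UL["dNTP mix"], total_ul)
# Assumption: 20 micromolar primer stock.
primer_uM = final_concentration(20.0, VOLUMES_UL["sense primer"], total_ul)
buffer_x = final_concentration(10.0, VOLUMES_UL["10x buffer"], total_ul)

print(total_ul, dntp_mM, primer_uM, buffer_x)
```

Under these assumptions the components add up to a 100 µL reaction with 1x buffer, 0.3 mM dNTP mix, and 0.4 µM of each primer.
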

Upon agarose gel analysis, a single band of ~1.3 Kb in size was observed. This
fragment was purified using a QIAquick PCR purification kit (Qiagen, Valencia, CA) and
cloned into pCRII TOPO using the TOPO Zero Blunt PCR cloning kit (Invitrogen,
Carlsbad, CA). 1 µL of the pCRII TOPO ligation mix was used to transform chemically
competent TOP10 E. coli cells. The cells were recovered for 1 hour in SOC media, and the
transformants were selected on LB/kanamycin (50 µg/mL) plates. Single colonies of the
transformants were grown overnight in LB/kanamycin media, and the plasmid DNA was
extracted using a Mini prep kit (Qiagen, Valencia, CA). The super-coiled plasmid DNA
was separated on a 1% agarose gel and digested, and the colonies with insert were selected.
The insert was sequenced to confirm the sequence and its quality.

The plasmid having the correct insert was digested with restriction enzymes Nco I
and BamH I. The digested insert was gel isolated and ligated to pET28b expression
vector that was also restricted with Nco I and BamH I enzymes. 1 µL of ligation mix was
used to transform chemically competent TOP10 E. coli cells. The cells were recovered
for 1 hour in SOC media, and the transformants were selected on LB/kanamycin (50
µg/mL) plates. The super-coiled plasmid DNA was separated on a 1% agarose gel and
digested, and the colonies with insert were selected. The plasmid with the insert was
isolated using a Mini prep kit (Qiagen, Valencia, CA), and 1 µL of this plasmid DNA was
used to transform electrocompetent BL21(DE3) (Novagen, Madison, WI). These cells
were used to check the expression of a polypeptide having 4-aminobutyrate
aminotransferase activity.

B. Cloning mmsB (3-hydroxyisobutyrate dehydrogenase) from P. aeruginosa

The following PCR primers were designed based on a published sequence for a
mmsB gene from Pseudomonas aeruginosa (GenBank# M84911):

Ppu hid nde sen: 5'-ATACATATGACCGACCGACATCGCATT-3' (SEQ ID NO:186)
Ppu hid sal anti: 5'-ATAGTCGACGGGTCAGTCCTTGCCXjGG-3' (SEQ ID NO:187)



The primers introduced an Nde I site at the 5' end and a BamH I site at the 3' end.
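
Whether a primer carries an intended restriction site can be confirmed by scanning it for the recognition sequence. The sketch below is a hedged illustration rather than part of the cloning procedure: `find_sites` and the `SITES` table are illustrative names, and only the two sense primers are scanned. The recognition sequences listed are palindromic, so searching one strand is sufficient.

```python
# Recognition sequences (5'->3') for enzymes named in this example.
SITES = {"Nco I": "CCATGG", "Nde I": "CATATG", "BamH I": "GGATCC"}


def find_sites(primer, sites=SITES):
    """Return {enzyme: 0-based position} for each recognition site present."""
    primer = primer.upper()
    return {name: primer.find(seq) for name, seq in sites.items() if seq in primer}


gabT_sense = "GAGCCATGGAAGAAATAAATGCTAAAG"  # Cac aba nco sen (SEQ ID NO:185)
mmsB_sense = "ATACATATGACCGACCGACATCGCATT"  # Ppu hid nde sen

print(find_sites(gabT_sense))
print(find_sites(mmsB_sense))
```

In both sense primers the site begins after a three-nucleotide 5' overhang, and the site itself contains the ATG start codon of the coding sequence.
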



Reaction mix                                          PCR Program
H2O                                        80.75 µL   94°C  5 minutes
Taq Plus Long 10x Buffer                   10 µL      25 cycles of:
dNTP mix (10 mM)                           3 µL         94°C  30 seconds
Ppu hid nde sen (20 µM)                    2 µL         55°C  30 seconds
Ppu hid sal anti (20 µM)                   2 µL         72°C  90 seconds + 2 seconds/cycle
C. acetobutylicum DNA (~100 ng)            1 µL       68°C  7 minutes
Taq Plus Long (Stratagene, La Jolla, CA)   1 µL       4°C until use
Pfu (Stratagene, La Jolla, CA)             0.25 µL





A PCR reaction was set up using chromosomal DNA from P. aeruginosa as the
template. Chromosomal DNA was obtained from ATCC (Manassas, VA) P. aeruginosa
17933D.

Upon agarose gel analysis, a single band of ~1.6 Kb in size was observed. This
fragment was purified using a QIAquick PCR purification kit (Qiagen, Valencia, CA) and
cloned into pCRII TOPO using the TOPO Zero Blunt PCR cloning kit (Invitrogen,
Carlsbad, CA). 1 µL of the pCRII TOPO ligation mix was used to transform chemically
competent TOP10 E. coli cells. The cells were recovered for 1 hour in SOC media, and
the transformants were selected on LB/kanamycin (50 µg/mL) plates. Single colonies of
the transformants were grown overnight in LB/kanamycin media, and the plasmid DNA was
extracted using a Mini prep kit (Qiagen, Valencia, CA). The super-coiled plasmid DNA
was separated on a 1% agarose gel and digested, and the colonies with insert were
selected. The insert was sequenced to confirm the sequence and its quality.

The plasmid having the correct insert was digested with restriction enzymes Nde I
and BamH I. The digested insert was gel isolated and ligated to pET30a expression vector
that was also restricted with Nde I and BamH I enzymes. 1 µL of ligation mix was used
to transform chemically competent TOP10 E. coli cells. The cells were recovered for 1
hour in SOC media, and the transformants were selected on LB/kanamycin (50 µg/mL)
plates. The super-coiled plasmid DNA was separated on a 1% agarose gel and digested,
and the colonies with insert were selected. The plasmid with the insert was isolated using
a Mini prep kit (Qiagen, Valencia, CA), and 1 µL of this plasmid DNA was used to
transform electrocompetent BL21(DE3) (Novagen, Madison, WI). These cells were used
to check the expression of a polypeptide having 3-hydroxyisobutyrate dehydrogenase
activity.

OTHER EMBODIMENTS

It is to be understood that while the invention has been described in conjunction
with the detailed description thereof, the foregoing description is intended to illustrate and
not limit the scope of the invention, which is defined by the scope of the appended claims.
Other aspects, advantages, and modifications are within the scope of the following
claims.

WHAT IS CLAIMED IS:

1. A cell comprising lactyl-CoA dehydratase activity and 3-hydroxypropionyl-CoA
dehydratase activity.

2. The cell of claim 1, wherein said cell comprises an activity selected from the
group consisting of E1 activator activity, E2 α activity, and E2 β activity.

3. The cell of claim 1, wherein said cell comprises 3-hydroxypropionyl-CoA
dehydratase activity.

4. The cell of claim 1, wherein said cell comprises CoA transferase activity.

5. The cell of claim 1, wherein said cell comprises an exogenous nucleic acid
comprising:

(a) a sequence set forth in SEQ ID NO:1, 9, 17, 25, 33, 34, 36, 38, 40, 42, 129,
140, 142, 162, or 163; or

(b) a nucleic acid sequence that shares at least 65 percent sequence identity with a
sequence set forth in SEQ ID NO:1, 9, 17, 25, 33, 34, 36, 38, 40, 42, 129, 140, 142, 162,
or 163.

6. The cell of claim 1, wherein said cell comprises 3-hydroxypropionyl-CoA
hydrolase activity or 3-hydroxyisobutyryl-CoA hydrolase activity.

7. The cell of claim 1, wherein said cell comprises lipase activity.

8. The cell of claim 1, wherein said cell produces 3-HP.

9. The cell of claim 1, wherein said cell produces an ester of 3-HP.


WO 02/42418 



PCT/US01/43607 



10. The cell of claim 9, wherein said ester is selected from the group consisting of
methyl 3-hydroxypropionate, ethyl 3-hydroxypropionate, propyl 3-hydroxypropionate,
butyl 3-hydroxypropionate, and 2-ethylhexyl 3-hydroxypropionate.

11. The cell of claim 1, wherein said cell comprises CoA synthetase activity.

12. The cell of claim 1, wherein said cell comprises poly hydroxyacid synthase
activity.

13. The cell of claim 1, wherein said cell produces polymerized 3-HP.

14. The cell of claim 1, wherein said cell is prokaryotic.

15. The cell of claim 1, wherein said cell is selected from the group consisting of
yeast, Lactobacillus, Lactococcus, Bacillus, and Escherichia cells.

16. A cell comprising CoA synthetase activity, lactyl-CoA dehydratase activity, and 
poly hydroxyacid synthase activity. 

17. The cell of claim 16, wherein said cell comprises an activity selected from the
group consisting of E1 activator activity, E2 α activity, and E2 β activity.

18. The cell of claim 16, wherein the cell produces polymerized acrylate.

19. The cell of claim 16, wherein said cell is prokaryotic.

20. The cell of claim 16, wherein said cell is selected from the group consisting of 
yeast, Lactobacillus, Lactococcus, Bacillus, and Escherichia cells. 

21. A cell comprising CoA transferase activity, lactyl-CoA dehydratase activity, and
lipase activity.

22. The cell of claim 21, wherein said cell comprises an activity selected from the
group consisting of E1 activator activity, E2 α activity, and E2 β activity.

23. The cell of claim 21, wherein said cell produces an ester of acrylate.

24. The cell of claim 23, wherein said ester is selected from the group consisting of
methyl acrylate, ethyl acrylate, propyl acrylate, and butyl acrylate.

25. The cell of claim 21, wherein said cell is prokaryotic.

26. The cell of claim 21, wherein said cell is selected from the group consisting of
yeast, Lactobacillus, Lactococcus, Bacillus, and Escherichia cells.

27. A polypeptide comprising an amino acid sequence selected from the group
consisting of:

(a) a sequence set forth in SEQ ID NO:2, 10, 18, 26, 35, 37, 39, 41, 141, 160, or
161;

(b) a sequence having at least 10 contiguous amino acid residues of a sequence set
forth in SEQ ID NO:2, 10, 18, 26, 35, 37, 39, 41, 141, 160, or 161;

(c) a sequence that has at least 65 percent sequence identity with a sequence set
forth in SEQ ID NO:2, 10, 18, 26, 35, 37, 39, 41, 141, 160, or 161;

(d) a sequence that has at least 65 percent sequence identity with at least 10
contiguous amino acid residues of a sequence set forth in SEQ ID NO:2, 10, 18, 26, 35,
37, 39, 41, 141, 160, or 161; and

(e) a sequence set forth in SEQ ID NO:2, 10, 18, 26, 35, 37, 39, 41, 141, 160, or
161 that contains at least one conservative substitution.

28. A nucleic acid molecule comprising a nucleic acid sequence that encodes the
polypeptide of claim 27.

29. A transformed cell comprising at least one exogenous nucleic acid molecule,
wherein said molecule comprises a nucleic acid sequence that encodes the polypeptide of
claim 27.

30. The cell of claim 29, wherein the cell produces 3-HP.

31. The cell of claim 29, wherein said exogenous nucleic acid molecule encodes an
E2 α polypeptide of an enzyme having lactyl-CoA dehydratase activity.

32. The cell of claim 29, wherein said exogenous nucleic acid molecule encodes an
E2 β polypeptide of an enzyme having said lactyl-CoA dehydratase activity.

33. The cell of claim 29, wherein said exogenous nucleic acid molecule encodes a
polypeptide having 3-hydroxypropionyl-CoA dehydratase activity or CoA transferase
activity.

34. The cell of claim 29, wherein said exogenous nucleic acid molecule encodes a
polypeptide having CoA hydrolase activity.

35. The cell of claim 29, wherein the cell comprises lipase activity. 

36. The cell of claim 29, wherein the cell produces an ester of 3-HP. 

37. The cell of claim 36, wherein said ester is selected from the group consisting of
methyl 3-hydroxypropionate, ethyl 3-hydroxypropionate, propyl 3-hydroxypropionate,
butyl 3-hydroxypropionate, and 2-ethylhexyl 3-hydroxypropionate.

38. The cell of claim 29, wherein said cell comprises CoA synthetase activity.

39. The cell of claim 29, wherein said cell produces polymerized 3-HP.

40. The cell of claim 29, wherein said cell is prokaryotic.

41. The cell of claim 29, wherein said cell is selected from the group consisting of
Lactobacillus, Lactococcus, Bacillus, and Escherichia cells.

42. The cell of claim 29, wherein the cell is a yeast cell.

43. A specific binding agent that specifically binds to the polypeptide of claim 27.

44. An isolated nucleic acid molecule comprising a nucleic acid sequence selected
from the group consisting of:

(a) a sequence set forth in SEQ ID NO:1, 9, 17, 25, 33, 34, 36, 38, 40, 42, 129,
140, 142, 162, or 163;

(b) a sequence having at least 10 contiguous nucleotides of a sequence set forth in
SEQ ID NO:1, 9, 17, 25, 33, 34, 36, 38, 40, 42, 129, 140, 142, 162, or 163;

(c) a sequence that has at least 65 percent sequence identity with a sequence set
forth in SEQ ID NO:1, 9, 17, 25, 33, 34, 36, 38, 40, 42, 129, 140, 142, 162, or 163;

(d) a sequence that has at least 65 percent sequence identity with at least 10
contiguous nucleotides of a sequence set forth in SEQ ID NO:1, 9, 17, 25, 33, 34, 36, 38,
40, 42, 129, 140, 142, 162, or 163; and

(e) a sequence that hybridizes under moderately stringent conditions to a sequence set
forth in SEQ ID NO:1, 9, 17, 25, 33, 34, 36, 38, 40, 42, 129, 140, 142, 162, or 163.

45. A production cell comprising an isolated nucleic acid molecule of claim 44 that is
exogenous to said production cell.

46. The cell of claim 45, wherein said isolated nucleic acid molecule encodes a
polypeptide having an enzymatic activity selected from the group consisting of CoA
transferase activity, lactyl-CoA dehydratase activity, CoA synthase activity, CoA
dehydratase activity, dehydrogenase activity, malonyl-CoA reductase activity, and 3-
hydroxypropionyl-CoA dehydratase activity.

47. A method of producing a polypeptide, comprising culturing the cell of claim 45
under conditions that allow said cell to produce said polypeptide, wherein said
polypeptide is produced.

48. A method for making 3-HP, said method comprising culturing at least one cell
comprising at least one exogenous nucleic acid molecule that encodes at least one
polypeptide that is capable of producing said 3-HP from PEP under conditions such that
said 3-HP is produced.

49. The method of claim 48, wherein said cell is selected from the group consisting of
yeast, Lactobacillus, Lactococcus, Bacillus, and Escherichia cells.

50. The method of claim 48, wherein 3-HP is made by a biosynthetic route that
utilizes a β-alanine intermediate.

51. The method of claim 48, wherein 3-HP is made by a biosynthetic route that
utilizes a malonyl-CoA intermediate.

52. The method of claim 48, wherein 3-HP is made by a biosynthetic route that 
utilizes a lactate intermediate. 

53. A method for making 3-HP, said method comprising culturing at least one cell
comprising at least one exogenous nucleic acid molecule that encodes at least one
polypeptide that is capable of producing said 3-HP from lactate under conditions such
that said 3-HP is produced.

54. The method of claim 53, wherein said cells are selected from the group consisting
of yeast, Lactobacillus, Lactococcus, Bacillus, and Escherichia cells.

55. A method for making 3-HP, said method comprising culturing at least one cell
under conditions wherein said cell produces said 3-HP, said cell comprising lactyl-CoA
dehydratase activity and 3-hydroxypropionyl-CoA dehydratase activity.

56. The method of claim 55, wherein said cell is selected from the group consisting of
yeast, Lactobacillus, Lactococcus, Bacillus, and Escherichia cells.

57. The method of claim 55, wherein said cell comprises CoA transferase activity.

58. The method of claim 55, wherein said cell comprises 3-hydroxypropionyl-CoA
hydrolase activity or 3-hydroxyisobutyryl-CoA hydrolase activity.

59. A method for making 3-HP, said method comprising:

a) contacting lactate with a first polypeptide having CoA transferase activity to
form lactyl-CoA,

b) contacting said lactyl-CoA with a second polypeptide having lactyl-CoA
dehydratase activity to form acrylyl-CoA,

c) contacting said acrylyl-CoA with a third polypeptide having 3-
hydroxypropionyl-CoA dehydratase activity to form 3-HP-CoA, and

d) contacting said 3-HP-CoA with said first polypeptide to form said 3-HP or with
a fourth polypeptide having 3-hydroxypropionyl-CoA hydrolase activity or 3-
hydroxyisobutyryl-CoA hydrolase activity to form said 3-HP.

60. A method for making polymerized 3-HP, said method comprising culturing a cell
under conditions wherein said cell produces said polymerized 3-HP, said cell comprising
lactyl-CoA dehydratase activity and 3-hydroxypropionyl-CoA dehydratase activity.

61. The method of claim 60, wherein said cell is selected from the group consisting of
yeast, Lactobacillus, Lactococcus, Bacillus, and Escherichia cells.

62. The method of claim 60, wherein said cell comprises CoA synthetase activity.

63. The method of claim 60, wherein said cell comprises poly hydroxyacid synthase 
activity. 

64. A method for making polymerized 3-HP, said method comprising:

a) contacting lactate with a first polypeptide having CoA synthetase activity to
form lactyl-CoA,

b) contacting said lactyl-CoA with a second polypeptide having lactyl-CoA
dehydratase activity to form acrylyl-CoA,

c) contacting said acrylyl-CoA with a third polypeptide having 3-
hydroxypropionyl-CoA dehydratase activity to form 3-hydroxypropionic acid-CoA, and

d) contacting said 3-hydroxypropionic acid-CoA with a fourth polypeptide having
poly hydroxyacid synthase activity to form said polymerized 3-HP.

65. A method for making an ester of 3-HP, said method comprising culturing a cell
under conditions wherein said cell produces said ester, said cell comprising lactyl-CoA
dehydratase activity and 3-hydroxypropionyl-CoA dehydratase activity.

66. The method of claim 65, wherein said cell is selected from the group consisting of
yeast, Lactobacillus, Lactococcus, Bacillus, and Escherichia cells.

67. The method of claim 65, wherein said cell comprises CoA transferase activity.

68. The method of claim 65, wherein said cell comprises 3-hydroxypropionyl-CoA
hydrolase activity or 3-hydroxyisobutyryl-CoA hydrolase activity.

69. A method for making an ester of 3-HP, said method comprising:

a) contacting lactate with a first polypeptide having CoA transferase activity to
form lactyl-CoA,

b) contacting said lactyl-CoA with a second polypeptide having lactyl-CoA
dehydratase activity to form acrylyl-CoA,

c) contacting said acrylyl-CoA with a third polypeptide having 3-
hydroxypropionyl-CoA dehydratase activity to form 3-hydroxypropionic acid-CoA,

d) contacting said 3-hydroxypropionic acid-CoA with said first polypeptide to
form 3-HP or a fourth polypeptide having 3-hydroxypropionyl-CoA hydrolase activity or
3-hydroxyisobutyryl-CoA hydrolase activity to form 3-HP, and

e) contacting said 3-HP with a fifth polypeptide having lipase activity to form said
ester.

70. A method for making polymerized acrylate, said method comprising culturing a
cell under conditions wherein said cell produces said polymerized acrylate, said cell
comprising CoA synthetase activity and lactyl-CoA dehydratase activity.

71. The method of claim 70, wherein said cell is selected from the group consisting of
yeast, Lactobacillus, Lactococcus, Bacillus, and Escherichia cells.

72. The method of claim 70, wherein said cell comprises poly hydroxyacid synthase 
activity. 

73. A method for making polymerized acrylate, said method comprising:

a) contacting lactate with a first polypeptide having CoA synthetase activity to
form lactyl-CoA,

b) contacting said lactyl-CoA with a second polypeptide having lactyl-CoA
dehydratase activity to form acrylyl-CoA, and

c) contacting said acrylyl-CoA with a third polypeptide having poly hydroxyacid
synthase activity to form said polymerized acrylate.

74. A method for making an ester of acrylate, said method comprising culturing a cell
under conditions wherein said cell produces said ester, said cell comprising CoA
transferase activity and lactyl-CoA dehydratase activity.

75. The method of claim 74, wherein said cell is selected from the group consisting of
yeast, Lactobacillus, Lactococcus, Bacillus, and Escherichia cells.

76. The method of claim 74, wherein said cell comprises lipase activity.

77. A method for making an ester of acrylate, said method comprising:

a) contacting lactate with a first polypeptide having CoA transferase activity to
form lactyl-CoA,

b) contacting said lactyl-CoA with a second polypeptide having lactyl-CoA
dehydratase activity to form acrylyl-CoA,

c) contacting said acrylyl-CoA with said first polypeptide to form acrylate, and

d) contacting said acrylate with a third polypeptide having lipase activity to form
said ester.

78. A method for making 3-HP, said method comprising culturing a cell under
conditions wherein said cell produces said 3-HP, said cell comprising at least one
exogenous nucleic acid that encodes at least one polypeptide such that said 3-HP is
produced from acetyl-CoA and under conditions such that said 3-HP is produced.

79. The method of claim 78, wherein said cell is selected from the group consisting of
yeast, Lactobacillus, Lactococcus, Bacillus, and Escherichia cells.

80. A method for making 3-HP, said method comprising culturing a cell under
conditions wherein said cell produces said 3-HP, said cell comprising at least one
exogenous nucleic acid that encodes at least one polypeptide such that said 3-HP is
produced from malonyl-CoA and under conditions such that said 3-HP is produced.

81. The method of claim 80, wherein said cell is selected from the group consisting of
yeast, Lactobacillus, Lactococcus, Bacillus, and Escherichia cells.

82. A method for making 3-HP, said method comprising culturing a cell under
conditions wherein said cell produces said 3-HP, said cell comprising at least one
exogenous nucleic acid that encodes at least one polypeptide such that said 3-HP is
produced from β-alanine and under conditions such that said 3-HP is produced.

83. The method of claim 82, wherein said cell is selected from the group consisting of
yeast, Lactobacillus, Lactococcus, Bacillus, and Escherichia cells.

84. A method for making 3-HP, said method comprising culturing cells comprising an
exogenous nucleic acid that encodes polypeptides that are capable of producing 3-HP
from acetyl-CoA under conditions such that said 3-HP is produced.

85. The method of claim 84, wherein said cells are selected from the group consisting
of yeast, Lactobacillus, Lactococcus, Bacillus, and Escherichia cells.

86. A method for making 3-HP, said method comprising culturing cells comprising at
least one exogenous nucleic acid that encodes polypeptides that are capable of producing
said 3-HP from malonyl-CoA, and under conditions such that said 3-HP is produced.

87. The method of claim 86, wherein said cells are selected from the group consisting
of yeast, Lactobacillus, Lactococcus, Bacillus, and Escherichia cells.

88. A method for making 3-HP, said method comprising:

a) contacting acetyl-CoA with a first polypeptide having acetyl-CoA carboxylase
activity to form malonyl-CoA, and

b) contacting said malonyl-CoA with a second polypeptide having malonyl-CoA
reductase activity to form said 3-HP.

89. A method for making 3-HP, said method comprising contacting malonyl-CoA
with a polypeptide having malonyl-CoA reductase activity to form said 3-HP.

90. A method for making 3-HP, said method comprising:

a) contacting β-alanine CoA with a first polypeptide having β-alanyl-CoA
ammonia lyase activity to form acrylyl-CoA;

b) contacting said acrylyl-CoA with a second polypeptide having 3HP-CoA
dehydratase activity to form said 3-HP-CoA; and

c) contacting 3-HP-CoA with a third polypeptide having glutamate dehydrogenase
to make 3-HP.

Figure 6

atgagaaaagtagaaatcattacagctgaacaagcagctcagctcgtaaaagacaacgac 

acgattaogtctatcggctttgtcagcagcgcccatccggaagcactgatxaaagctttg 

gaaaaacggttgctggacacgaacacccggcagaacttgagctacatctatgcaggctct 

cagggcaaacgcgatggccgrggcgctgaacatctggcacacacagggcttttgaaacgc 

gccatcatcggtcactggcagactgtagcggctatcggtaaactggctgtcgaaaacaag 

attgaagcttacaacttctcgcagggcacgttggtccactggttcegggccttggcaggt 

cataagctcggcgtcttcaccgacatcggtctggaaactttcctcgatccgcgtcagctc 

ggcggcaagctcaatgaggtaaccaaagaagacctcgtcaaactgatcgaagtggatggt 

catgaacagcttttctacccgaccttcccggtcaaggtagctttgctccggggtacgtat 

gctgatgaatccggcaatatcaccatggacgaagaaatcgggcctttcgaaagcacttcc 

gtaggccaggccgttcacaactgtggcggtaaagtcgtcgtccaggtcaaagaggtcgtc 

-gctcacggcagcctcgacccgcgcatggtcaagatcectggcatctatgtcgactacgtc 

gtcgtagcagctcgggaagaccatcagcagacgtatgactgcgaatacgatccgtccctc 

agcggtgaacatcgtgctcctgaaggcgctaccgatgcagctctccccatgagcgctaag 

aaaatcatcggcggccgcggggctttggaattgactgaaaacgctgtcgtcaaccteggc 

gtcggtgctccggaatacgttgcttctgttgccggtgaagaaggtatggccgataccatt 

accctgaccgtcgaaggtggcgccatcggtggcgtagcgcagggcggtggccgcttcggt 

tcgtcccgcaatggcgatgccatcatcgaccacacctatcagttcgacttctacgatggc 

"ggcggtctggacatcgcttacctcggcctggcdcagtgcgatggctcgggcaacatcaac 

gtcagcaagttcggtactaacgttgc<:ggctgcggcggtttccccaacatttcccagcag 

acaccgaatgtttacttctgcggcaccttcacggctggcggcttgaaaatcgctgtcgaa 

gacggcaaagtcaagatcctccaggaaggcaaaggcaagaagttcatcaaagctgtc«ac 

cagatcactttcaacggttcctatgcagcccx3caacggcaaacacgttctctacatcaca 

gaaggctgcgtatttgaactgaccaaagaaggcttgaaactcatcgaagtcgcaccgggc

atcgatattgaaaaagatatcgtcgctcacatggacttcaagcggatcattgataatccg 

aaactcatggatgggcgcctcttccaggacggtgccatgggactgaaaaaataa (SEQ ID NO:1)

Figure 7

MRKVEIITAEQAAQLVKDNDTITSIGFVSSAHPEALTKALEKRFLDTNTPQNLTYIYAGS
QGKRDGRAAEHLAHTGLLKRAIIGHWQTVPAIGKLAVENKIEAYNFSQGTLVHWFRALAG
HKLGVFTDIGLETFLDPRQLGGKLNDVTKEDLVKLIEVDGHEQLFYPTFPVNVAFLRGTY
ADESGNITMDEEIGPFESTSVAQAVHNCGGKVVVQVKDVVAHGSLDPRMVKIPGIYVDYV
VVAAPEDHQQTYDCEYDPSLSGEHRAPEGATDAALPMSAKKIIGRRGALELTENAVVNLG
VGAPEYVASVAGEEGIADTITLTVEGGAIGGVPQGGARFGSSRNADAIIDHTYQFDFYDG
GGLDIAYLGLAQCDGSGNINVSKFGTNVAGCGGFPNISQQTPNVYFCGTFTAGGLKIAVE
DGKVKILQEGKAKKFIKAVDQITFNGSYAARNGKHVLYITERCVFELTKEGLKLIEVAPG
IDIEKDILAHMDFKPIIDNPKLMDARLFQDGPMGLKK (SEQ ID NO:2)

Figure 8

SEQ ID NO: 1 1 atgagaaaagtagaaatcattacagctgaacaagcagctc—agctcgta

SEQ ID NO: 3 1 gtgccggtcctgtcggcacaggaagcggtga— attatatt 

SEQ ID NO: 4 1 atgccgattctctcaaaaatatgggcggctccagcagctggaatcttgag 

SEQ ID NO: 5 1 at * aa t « ca 

SEQ ID NO: 1 49 aaagacaacgacacgattacgtctatcggctttgtcagcagcgcccatcc 

SEQ ID NO: 3 40 cccgacgaagcaacactttgtgtgttaggcgctg gcggcggtattct 

SEQ ID NO: 4 51 aaaaactccgagaaatgctcatcaaatgaggctaatctcaatga-catcc

SEQ ID NO: 5 10 aaaga atta a teg 

SEQ ID NO: 1 99 ggaagcactgaccaaagctttggaaaaacggttcctg

SEQ ID NO: 3 87 ggaag ccaccacgtt—aattactgctcttgctgataaatataa

SEQ ID NO: 4 100 tcgatgaaagcaaaagtcttt aactctgc

SEQ ID NO: 5 23 

SEQ ID NO: 1 136 gacacgaacaccccgcagaacttgacctacatctatgcag-gctctc

SEQ ID NO: 3 129 acagactcaaacaccacgt— aatttatcgattattagtccaa-cagggc 

SEQ ID NO: 4 129 cgaagaagccgtgaaggatattccagat-aatgcaaagctttt 

SEQ ID NO: 5 23 ctcgccgaat* 

SEQ ID NO: 1 182 agggcaaacgcgatggccgtgccgctgaacatctggcacacacaggcctt

SEQ ID NO: 3 176 ttggcgatcgcgccgaccgtggtattagtcctctggcgcaagaaggtctg

SEQ ID NO: 4 171 a gttggc— ggcttcggactatgcgg-aatcccagaaaat 

SEQ ID NO: 5 34 gcgatgg 

SEQ ID NO: 1 232 ttgaaacgcgccatcatcggtcactggcagactgtaccggc-tatcggta

SEQ ID NO: 3 226 gtgaaatgggcattatgtggtcactgg-ggacaatcgccgcgtatttctg 

SEQ ID NO: 4 208 ctcatccaagctatca-caaaaaqtggtcaa — aaaggtc 

SEQ ID NO: 5 41 ; aattacatgatgga ga-tattgtta 

SEQ ID NO: 1 281 aactggctgtcgaaaacaagattgaagcttacaacttctcgcagggcacg

SEQ ID NO: 3 275 aactcgcagaacaaaataaaattattgcttataactacccacaaggtgta 

SEQ ID NO: 4 245 ttacatgtgtatcaaacaatgcgggagttgataatt ggggac- 

SEQ ID NO: 5 65 ateteggt attg— gtttac caacacagg 

SEQ ID NO: 1 331 ttggtccactggttccgcgccttggcaggtcataagctcggcgtcttcac

SEQ ID NO: 3 325 cttacacaaaccttacgcgccgccgcagcccaccagcctggtattattag 

SEQ ID NO: 4 287 ttggcttgctccttc—aaactcgacaaatc—aagaaaatgatctcatc 

SEQ ID NO: 5 92 ttgt- taattatttacctgataatgtcaata ttac 

SEQ ID NO: 1 381 cgacatcggtct ggaaa ctttcctcgatccccgtcagctcggc

SEQ ID NO: 3 375 tgatattggcat eggga catttgtcgatccacgccagcaaggc 

SEQ ID NO: 4 333 gtacgtcggtgaaaacggaga atttgetega eaatatcttagc 

SEQ ID NO: 5 126 — acttcaatca gaaaatggctttcttggtttaactgca 

SEQ ID NO: 1 424 ggcaagctcaatgacgtaacca aagaagacctcgtcaaactgat 

SEQ ID NO: 3 418 ggcaaactgaatgaagtcacta aagaagacctgattaaactggt 

SEQ ID NO: 4 376 ggagagctcgagttggaattcacaccacaaggaacactcgccgaacgaat

SEQ ID NO: 5 163 ~ tttgac cca gaaaatgetaattcaaact 

SEQ ID NO: 1 468 cgaagtcgatggtca tgaacagcttttctacccgacc

SEQ ID NO: 3 462 cgagtttgataacaa agaatatctctattacaaagcg 

SEQ ID NO: 4 426 tcgtgcagctggtgccggtgttcccgcattctacac-accaacaggatac 
SEQ ID NO: 5 191 — tagtaaatgetgg tggtcagcett

SEQ ID NO: 1 505 — ttcccgg— tcaacgtagctttcctccgcggtacgtatgctga tg

SEQ ID NO: 3 499 — attgcgc— cagatattgccttcattcgcgctaccacctgcga ca 

SEQ ID NO: 4 475 ggtacccagattcaagaaggaggtgctccga-ttaagtacagtaaaactg 

SEQ ID NO: 5 215 gtggaa ttaa aa 

SEQ ID NO: 1 548 aatccggcaatatc-accatggacg aagaaatcgggcctttc

SEQ ID NO: 3 542 gtgaaggctacgcc-acttfctgaag atgaggtgatgtatctc 

SEQ ID NO: 4 524 aaaaaggaaagattgaagttgcaagtaaagcgaaagaaacacgacaattc 

SEQ ID NO: 5 227 aaggcggctcta ctttt 

SEQ ID NO: 1 589 ga- — aagcacttccgta gcccaggccgttcac — aactgtggcggt

SEQ ID NO: 3 583 ga cgcattggttattgcccaggcggtgcac— aataacggcggt 

SEQ ID NO: 4 574 aatggaattaattatgtaatggaagaggctatttggggagattttgcatt 

SEQ ID NO: 5 244 ga tagtgctt t—ttctttcgcttt 

SEQ ID NO: 1 631 aaagtcgtcgtccaggtcaaagacgtcgtcgc tcacggcagcctc

SEQ ID NO: 3 625 attgtgatgatgcaggtgcagaaaatggttaa gaaagccacgctg

SEQ ID NO: 4 624 gatcaaggcgtggagagcagatac-tcttggaaatattcaattcagacat

SEQ ID NO: 5 267 aa r : ttc 

SEQ ID NO: 1 676 gacccgcgcatggtcaagatccctg gcatctatgtcgactac

SEQ ID NO: 3 670 catcctaaatctgtccgtattccgg g ttatctggtggat

SEQ ID NO: 4 673 .gctgctggaaatttcaataatccaatgtgcaaagcctctaaatgcac — c 

SEQ ID NO: 5 272 gtggcggtcatgtt gatgcctg tgtgctaggtggact — 

SEQ ID NO: 1 718 gtcgtcgtagcagctccggaagaccatcagcag — acgtatgactgcgaa

SEQ ID NO: 3 709 attgtggtggtcgatccg gafccaaacccaa— -ctgtatggcggtgca 

SEQ ID NO: 4 721 atcgtcgaagtag aggaaatcgtcgaaccgggagtaattgctccaaa 

SEQ ID NO: 5 309 ~ 

SEQ ID NO: 1 766 t acgatccgtccctcagcggtgaacatcgtgctcctg-aaggc

SEQ ID NO: 3 754 c cggttaaccgctttatttctggtgacttcacccttg-atgac 

SEQ ID NO: 4 768 cgatgtgcacattccatcaatctattgtcatcgtctagttttgggaaaga

SEQ ID NO:5 309 tg-aagtt 

SEQ ID NO: 1 808 gctac cgatgcagc tctccccatgagcgctaaga

SEQ ID NO: 3 796 agtac caaacttag cctgcccetaaac-caacgt 

SEQ ID NO: 4 818 actacaaaaaaccaatcgaacggccaatgttcgcacacgaaggaccaata 

SEQ ID NO: 5 316 gatca agaagcaaa tctcgc

SEQ ID NO: 1 842 aaatcatcggc-cgccgcggcgctttggaattgactgaaaacgctgtcgt

SEQ ID NO: 3 829 aaattagttgcgcggcgcgcattattcgaaatgcgtaaaggcgcggtggg 

SEQ ID NO: 4 868 aaaccatctac-atcggc — tgctggaaaatcgagagaaatcattg-cag 

SEQ ID NO: 5 336 taactgga 

SEQ ID NO: 1 891 caacctcggcgtcggtgctcc ggaat — acgttgcttctgttgcc

SEQ ID NO: 3 879 gaatgtcggcgtcggtattgc tgacg — gcattggcctggtcgcc 

SEQ ID NO: 4 914 cacgtgcagctttggagttcacagatggaatgtacgccaatttgggtatc 

SEQ ID NO: 5 344 tggtgcc 

SEQ ID NO: 1 934 gg — tgaagaaggtatcgccga tacca ttaccctgac 

SEQ ID NO: 3 922 <:g — agaagaaggttgtgctga tgact ttattctgac 

SEQ ID NO: 4 964 gggattccgactttggcgccaaattatataocaaatggatttactgttca

SEQ ID NO: 5 351 tg — gcaaaatggta ■ ; 

SEQ ID NO: 1 969 cgtcgaaggtg gcgccatcggtggcgt-accgcagggcggtgcc

SEQ ID NO: 3 957 ggtagaaacag gtccgattggcggaattacttcacaggggatcg 

SEQ ID NO: 4 1014 tttgcaaagtgagaatggtattattggagtggg-aceata tcca 

SEQ ID NO: 5 364

SEQ ID NO: 1 1012 cgcttcggttcgtcccgca-atgccgatgccatca tcgaccacacc

SEQ ID NO: 3 1001 c-ctttggcgcgaacgtga-atacccgtgccattc tggatatgacg

SEQ ID NO: 4 1057 agaaaag gaacagaagacgccgatctcattaatgctggaaaagagc 

SEQ ID NO: 5 364 ccagga-atg

SEQ ID NO: 1 1057 tatcagttcgacttctacgatggcggc ggtctggacatcg

SEQ ID NO: 3 1045 tcccagtttgatttttatcacggtggc ggtctggatgttt 

SEQ ID NO: 4 1103 caattactcttct-caaaggagcttcaattgttggttctgatgaatc 

SEQ ID NO: 5 373 ggcgga gcaatggacttag 

SEQ ID NO: 1 1097 cttacctcggcctgg cccagtgcgatg gctcgggcaac

SEQ ID NO: 3 1085 gttatttgagttttg ctgaagtcgacc agcacggtaac 

SEQ ID NO: 4 1149 attcgcaatgattcgtggttctcatatggatattactgtgctcggtgcac 

SEQ ID NO: 5 392 tg actggtgcaa- 

SEQ ID NO: 1 1135 atcaacgtcagca-agttcggtactaacgttgccggctgcggcggtttcc

SEQ ID NO: 3 1123 gtcggcgtgcata-aattcaatggtaaaatcatgggcaccggtggattta 

SEQ ID NO: 4 1199 ttca—gtgctcacagtttgg agatttagcgaattggatgattecg 

SEQ ID NO: 5 404 : ~ 

SEQ ID NO: 1 1184 ccaacatt — tcccagcagacaccgaatgtttacttctgcggcacct-tc

SEQ ID NO: 3 1172 ttgatatcagtgccacttcgaagaaaatcatt — ttctgcggcacat-ta 

SEQ ID NO: 4 1243 •ggaaaatt ggtga-aaggaatgggcggtgcaatggatcttgte 

SEQ ID NO: 5 404 aaaaagtgattatt ggca



SEQ ID NO: 1 1231 acggctggcggcttgaaaatcgctgtcgaagacggcaaagtcaagatcct

SEQ ID NO: 3 1219 actgcgggcagtttaaaaacagaaa

SEQ ID NO: 4 1285 tctgctcccgg agcccgtgt-gatcgttgtaatggagcatgtat 

SEQ ID NO: 5 422 tggaacattg tgccaagtcaggttoct 

SEQ ID NO: 1 1281 ccaggaaggcaaagccaagaagttcatcaaagctgtcgaccagatcactt

SEQ ID NO: 3 1269 ccaggaaggacgggtgaagaaatttattcgggaactaccggaaattactt 

SEQ ID NO: 4 1328 cgaagaacggagagccaaaaatt ctagagcactg 

SEQ ID NO: 5 449 ca'aaaattctaaag aaatgtacattaccget cacagcaagt 

SEQ ID NO: 1 1331 tcaacgg ttcctatgcagc ccgcaacggcaaacacgttctct

SEQ ID NO: 3 1319 tcagcggaaaaatcgctctcgagc gagggctgg atgttcgtt 

SEQ ID NO: 4 1362 cgaac ttcctctga~c cggcaaagg— agtaat^tcccg 

SEQ ID NO: 5 490 aaaaaag ttgccatggtggttaccgaat^ggca gtattta 

SEQ ID NO: 1 1373 a — catcacagaacgctgcgtatttgaactgacca — aagaa-ggcttga

SEQ ID NO: 3 1361 a— tatcactgagcgcgcagtattcacgctgaaag — aagac-ggcctgc 

SEQ ID NO: 4 1398 aatcattactgatatggcagttttcgacgtggacacaaagaacggattga 

SEQ ID NO: 5 530 a— ssttcattgaaggcagattagttcta a — aagaa catgc 

SEQ ID NO: 1 1418 aactcatcgaagtcgcaccgggcatcgatattgaaaaagatatcctcgct

SEQ ID NO: 3 1406 atttaatcgaaategcccctggcgtcgatttacaaaaagatattctcgac 

SEQ ID NO: 4 1448 cattgatcgaagt— caggaaggatc-ttactgtagatgatat 

SEQ ID NO: 5 567 tcctcat gtggatttagaaaca attaaagcc 

SEQ ID NO: 1 1468 cacatggacttcaagccgat—cattgata atccga— aactcatgg

SEQ ID NO: 3 1456 aaaatggatttcaccccagt~gatt1:cgccagaactca— aactgatgg

SEQ ID NO: 4 1488 — caagaaactca — ccg cttgcaa attcga— aatttccga 

SEQ ID NO: 5 598 aaaacag aagccgafcttcattgtt gccgatgatttcaaag 

SEQ ID NO: 1 1511 atgcccgcctcttccaggacggtcccatggga ctgaaaaaa

SEQ ID NO: 3 1502 acgaaagattatttatcgatgcggcgatgggttttgtcctgcctgaagcg 

SEQ ID NO: 4 1524 aaatctgaagccaatgggacaggctcctcfcta atcaaggataa- 

SEQ ID NO: 5 638 aaatgcaaatcagccag aaagga cttgaat^atga

SEQ ID NO: 1 1552 taa

SEQ ID NO: 3 1552 gctcattaa 

SEQ ID NO: 4 1567 

SEQ ID NO: 5 673

Figure 9

SEQ ID NO: 2 1 mrkveiit — aeqaaqlv 

SEQ ID NO: 6 1 mpvls aqeavnyi 

SEQ ID NO: 7 1 mpilskiwaapaagilrktprnahqmrlismtssmkakvfnsaeeavkdi

SEQ ID NO: 8 1 ronakeli arriamel



SEQ ID NO: 2 17 kdndtitsigfvssahpealt--kalekrfldtntpqnltyiyagsqgkr

SEQ ID NO: 6 14 pdeatlcvlg-agggileattlitaladkykqtqtprxxlsiisptglgdr 

SEQ ID NO: 7 51 pdnakllvggfglcgipenli--qai tktgqkgltcvsnnagv- 

SEQ ID NO: 8 16 hdgd-ivnlg- 

SEQ ID NO: 2 65 dgraaehlahtgllkraiighwqtvpaigklavenkieaynfsqgtlvhw 

SEQ ID NO: 6 63 adrgisplaqeglvkwalcghwgqspriselaeqnkiiaynypqgvltqt 

SEQ ID NO: 7 92 dnwglglllqtrqikkmissyvgengefarqylsgeleleftpqgtlaer 

SEQ ID NO:8 25 — 

SEQ ID NO: 2 115 fralaghklgvftdigletf ldprqlggklndvtkedlvkliev 

SEQ ID NO: 6 113 lraaaahqpgiisdigigtfvdprqqggklnevtkedliklvef 

SEQ ID NO: 7 142 iraagagvpafytptgygtqi qeggapikysktekgk-ievaskake 

SEQ ID NO: 8 25 igl ^ 



SEQ ID NO: 2 159 dgheqlfyptfpvnvaflrgtyadesgnitmdeeigpfestsvaqa

SEQ ID NO: 6 157 dnkeylyykaiapdiaf irattcdsegyatfedevmyldalviaqa 

SEQ ID NO: 7 188 trqfnginyvmeeaiwgdf alikawradtlgniqf rhaagnfnnpmckas 

SEQ ID NO: 8 28 P^<FT^ ;-~y^^y^tlqsengflglta 

SEQ ID NO: 2 205 vhncggkvvvqvkdvvahgsldprmvkipgiyvdyvvvaapedhqqtydc

SEQ ID NO: 6 203 vhimggivmmqvqkmvkkatlhpksvripgylvd--ivvvdpdqtqlygga 

SEQ ID NO: 7 238 — kc tiveveeivepgviapndvhipsiychrlvlg knykk 

SEQ ID NO: 8 55 ' 

SEQ ID NO: 2 255 eydpslsgehrapegatdaalpmsakkiigrrgaleltenavvnlgvg — 

SEQ ID NO: 6 252 pvnrf isgdf tl-ddstklslplnqrklvarralfemrkgavgnvgvg— 

SEQ ID NO: 7 277 pierpmfahegpikpstsaa — gksreiiaaraaleftdgmyanlgigip 

SEQ ID NO: 8 55 -fdp enansnl-vn — 

SEQ ID NO: 2 303 — apeyvasvageegiadtitltveggaig — gvpqggarfgssrnad-- 

SEQ ID NO: 6 299 — iadgiglvareegcaddfiltvetgpig — gitsqgiafganvntr — 

SEQ ID NO: 7 325 tlapnyipn gftvhlqsengiigvgpyprkgtedadlinagke 

SEQ ID NO: 8 67 —a ggqpc—gikkggstf 

SEQ ID NO: 2 347 aiidhtyqfdfydgggldiaylglaqcdgsgni-nvskfgtn 

SEQ ID NO: 6 343 aildmtsqfdfyhgggldvcylsfaevdqhgnv-gvhkfngk 

SEQ ID NO: 7 368 pitllkgasivgsdesfamirgshiaditvlgalqcsqfgdlanwmipgkl 

SEQ ID NO: 8 82 dsaf sfalirgghvdacvlgglevdqeanlanwmvpgkm 

SEQ ID NO: 2 388 vagcggfpnisqqtpnvyfcgtftagglkiav edgkvkilqegk 

SEQ ID NO: 6 384 imgtggfidisatskkiifcgtltagslktei tdgkinivqegr 

SEQ ID NO: 7 418 vkgmggamdl — vsapgarvi wmehvs kngepkilehce 

SEQ ID NO: 8 121 vpgmggamdlvtgakkvii- gmehca ksgsskilk 

SEQ ID NO: 2 432 akkf ikavdqitfngsyaarngkhvl—yitercvfel-tkeglklieva 

SEQ ID NO: 6 428 vkkfirelpeitfsgkialergldvr—yiteravftl-kedglhlieia 

SEQ ID NO: 7 456 Ipltgkgvisriitdmavfdvdtkngltlievr 

SEQ ID NO: 8 155 kctlplt askkvam—wtelavfnf-iegrlvlkeha

SEQ ID NO: 2 479 pgidiekdi—lahmdfkpiidnp-klmdarlfqdgpmglkk

SEQ ID NO: 6 475 pgvdlqkdi—ldkmdf tpvispelklmderlf idaamgfvlpeaah 

SEQ ID NO:7 489 kdltvd-dikkltackfe-isenl-kpmgqaplnqg 

SEQ ID NO: 8 190 phvdle-ti— kakteadfivad dfkemqisqkglel

Figure 10

gtgaaaagtgtgtatactctcggaatcgaggttggttcttcttcttccaaggcagtcatc

ctggaagatggcaagaagatcgtcgcccatgccgtcgttgaaatcggcaccggttcgacc

ggtccggaacgcgtcctggacgaagtcttcaaagataccaacttaaaaattgaagacatg 

gcgaacatcatcgccacaggctatggccgtttcaatgtggactgcggcaaaggcgaagtc 

agcgaaatcacgtgccatgccaaaggggggctctttgaatgccccggtacgacgaccatc 

ctcgatatgggcggtcaggacgtgaagtcgatcaaattgaatggccagggcctggtcatg 

cagtttgccatgaacgacaaatgcgccgctggtacgggcggtttcctcgacgtcatgtcg

aaggtactggaaatccccatgtctgaaatgggggactggtacttcaaatcgaagcatccc 

ggtgcggtcagcagtacctgcacggtttttgctgaatcggaagtcatttcccttctttcc 

aagaatgtcccgaaagaagatatggtagccggtctcgatcagtccatcgccgccaaagcc 

tgcggtctcgtgcgccgcgtcggtgtcggtgaagacgtgaccatgaccggcggtggctcc

cgcgatccgggcgtcgtcgatgccgtatggaaagaattaggtattcctgtcagagtcgct

ctgcatcccgaagcggtgggtgctctcggagctcctttgattccttatgataaaatcaag 

aaataa (SEQ ID NO: 9)

Figure 11

VKTVYTLGIDVGSSSSKAVILEDGKKIVAHAVVEIGTGSTGPERVLDEVFKDTNLKIEDM
ANIIATGYGRFNVDCAKGEVSEITCHAKGALFECPGTTTILDIGGQDVKSIKLNGQGLVM
QFAMNDKCAAGTGRFLDVMSKVLEIPMSEMGDWYFKSKHPAAVSSTCTVFAESEVISLLS
KNVPKEDIVAGVHQSIAAKACALVRRVGVGEDLTMTGGGSRDPGVVDAVSKELGIPVRVA
LHPQAVGALGAALIAYDKIKK (SEQ ID NO: 10)

Figure 12

SEQ ID NO: 9 1 gtgaaaactgtgtatactctcggaatcgacgttggttcttcttcttccaa

SEQ ID NO: 11 1 atgagtatctataccttgggaatcgatgttggatctactgcatccaa

SEQ ID NO: 12 1 gtggcagtggcatattcgattggcattgattccggctcaaccgocaccaa 

SEQ ID NO: 13 1 atgattttagggatagatgttggatctacaacaacgaa 

SEQ ID NO: 9 51 ggcagtcatcctggaagatggcaagaagatcgtcgc-ccatgccgtcgtt

SEQ ID NO: 11 48 gtgcattatcctgaaagatggaaaagaaatcgtggc-gaaatccctggta 

SEQ ID NO: 12 51 agggatcttactggcagacggcgtgatta cgcgccgtttcctcgtt 

SEQ ID NO: 13 39 gatggttctaatggaagatagc aagataatttg-gtataagatagag 

SEQ ID NO: 9 100 gaaatcggcaccggttcgaccggtccggaacgcgtcctggacgaagtctt 

SEQ ID NO: 11 97 gccgtggggaccggaacttccggtcccgcacggtctatttcggaagtcct

SEQ ID NO: 12 97 ccaa ccccctttcgcccgg-caacagcaattact gaagcctg 

SEQ ID NO: 13 85 gatattgg-agttgtta ttgaggaagatattttattaaaaatggt 

SEQ ID NO: 9 150 caaagatacc-aacttaaaaattgaagacatggcgaacatcatcgc-cac

SEQ ID NO: 11 147 ggaaaatgcc-cacatgaaaaaagaagacatggcctttaccctggc-tac 

SEQ ID NO: 12 138 ggaa-actct-gcgcgaagggttagagacaacgccgtttctgacgctcac 

SEQ ID NO: 13 129 taaggagattgaacaaaaatatccaatagat aaaatcgttgc-aac 

SEQ ID NO: 9 198 aggctatggccgtttcaatgtcg -actgcgccaaaggcgaag 

SEQ ID NO: 11 195 cggctacggacg caat-tcgctggaaggcattgccgacaagcaga —

SEQ ID NO: 12 186 cggctacgggcggcaactggtgg attttgccgataaacagg 

SEQ ID NO: 13 174 tggatatggaaggcataaggtta gttttgcagataagatag 

SEQ ID NO: 9 239 tcagcgaaatcacgtgccatgccaaaggggcc ctctttgaatgcccc 

SEQ ID NO: 11 239 tgagcgaactgagctgccatgccatgggcgcc agctttatctggccc 

SEQ ID NO: 12 227 taacggaaatctcctgtcacgggctgggcgca cggtttcttgcgcca 

SEQ ID NO: 13 215 ttccagaagtta-ttgcattgggaaaaggagctaactatttct^taacga 

SEQ ID NO: 9 286 ggtacgacga — ccatcctcgatatcggcggtcaggacgtcaa-gtccat 

SEQ ID NO: 11 286 — aacgtccataccgtcatcgatatcggcgggcaggatgtgaa-ggfccat 

SEQ ID NO: 12 274 gcaacgcgcg—cggtaatcgacatcggtggtcaggacagcaaagtgatt 

SEQ ID NO: 13 264 ggcagatgga gttatagacattggagggcaagatacaaa-ggtctt* 

SEQ ID NO: 9 333 caaattga — atggccagggcctggtcatgcagtttgcc-atgaacgaca 

SEQ ID NO: 11 333 ccatgtgg—aaaacgggaccatgacca atttccag-atgaatgata 

SEQ ID NO: 12 322 cagcttgatgatgacggtaacctg— tgcgatttcctgatgaatgaca 

SEQ ID NO: 13 309 aaagattg — ataaaaacggaaaagttgttgattttatc-ctatcagata 

SEQ ID NO: 9 380 aatgcgccgctggtacgggccgtttcctcgacgtcatgtcgaaggtactg 

SEQ ID NO: 11 377 aatgcgctgccgggactggccgtttcctggatgttatggccaatatcctg 

SEQ ID NO: 12 368 aatgcgcggcgggcaccgggcgtttcctggaggtgatctcgcgcacgctt 

SEQ ID NO: 13 356 aatgtgccgctggaactggaaaattcttaga aaaggcatta 

SEQ ID NO: 9 430 gaaatccccatgtct-ga— aatgggggactggtactt-caaatcgaagc 

SEQ ID NO: 11 427 gaagtgaaggtttcc-ga — cctggctgagctgggagc-caaatccacca 

SEQ ID NO: 12 418 ggca—ccagcgtcgagc— aactcgacageattaccg-aaaat gtc 

SEQ ID NO: 13 397 gatattttaaaaatt-gataaaaatgagataaataaatacaaafccagata 

SEQ ID NO: 9 476 atcccgct-gccgtcagcagtacctgcacggtttttgctgaatcggaagt 

SEQ ID NO: 11 473 aacgggtg-gctatcagctccacctgtactgtgtttgcagaaagtgaagt 

SEQ ID NO: 12 460 acgccgcacgccatcacgagtatgtgcacagtgtttgctgaatcagaagc 

SEQ ID NO: 13 446 atatcgct-aaaatatcttcaatgtgtgctgtctttgctgaaagtgagat

SEQ ID NO: 9 525 catttcccttctttccaagaatgtcccgaaagaa — gatatcgtagccgg

SEQ ID NO: 11 522 <:atcagccagctgtccaa — aggaaccgacaagatcgacatcattgccgg 

SEQ ID NO: 12 510 gatcagcctgcgctcagcgggcgtcgcgccagaa — gcgattctcgcagg

SEQ ID NO: 13 495 aataagcttactatcaaaaaaagttccaaaggaa — ggcattttaatggg 

SEQ ID NO: 9 573 tgtccatcagtccatcgccgccaaagcctgcgctctcgtgc-gccgcgtc 

SEQ ID NO: 11 570 gatccatcgttctgtagccagccgggtcattggtGttgcca-atcgggtg 

SEQ ID NO: 12 558 agtgattaacgcgat-ggcgcggaggagtgc-caatttcat-tgctcgtc 

SEQ ID NO: 13 543 cgtctatgagagtat aataaatagggttafccccaatgaccaata 

SEQ ID NO: 9 622 ggtgtcgg — tgaagacctgaccatgaccggcggtggctcccgcgat — c 

SEQ ID NO: 11 619 gggattgt — gaaagacgtggtcatgaccggcggtgtagcccagaac — t

SEQ ID NO: 12 605 tctc-^ctg — tgaagcgccgattctgtttactggtggcgttagtcattgc 

SEQ ID NO: 13 567 ggcttaaaattcaaaacatagtgtttagtggaggagttgctaaaaa*. — a 

SEQ ID NO: 9 668 ccggcgtcgtcgatgccgtatcgaaagaat — taggtattcctgtc

SEQ ID NO: 11 665 atggcgtgagaggagccct ggaag aaggccttggcgtg 

SEQ ID NO: 12 652 cagaagt ttgcccggatgctggaatctcacctgcgaatgccggta 

SEQ ID NO: 13 635 aggttttggttgagatgtttgagaaaaaat tgaataaaaaacta 

SEQ ID NO: 9 712 agagtcgctctgcatccccaagcggtg ggtgctctcggagctgc 

SEQ ID NO: 11 703 gaaatcaagacgtctcccctggctcagtacaacggtgccctgggtgccgc 

SEQ ID NO: 12 697 aatacccatcctgatgcgcaatttgct ggcgcaattggcgcggc

SEQ ID NO: 13 679 ctaattccaaaagaaccacagattgtt- tgctgtgttggagctat 

SEQ ID NO: 9 756 tttgattgctta tgataaaatcaagaaa-taa 

SEQ ID NO: 11 753 tctgtatgcgta t-aaaaaagcagccaaataa 

SEQ ID NO: 12 741 ggtaattggtcaacgagtgaggacacgccgatga 

SEQ ID NO: 13 723 attggtt taa

Figure 13

SEQ ID NO: 10 1 vktvytlgidvgsssskaviledgkkivahavveigtgstgpervldevf

SEQ ID NO: 14 1 ms-iytlgidvgstaskciilkdgJceivakslvavgtgtsgiparsisevl 

SEQ ID NO: 15 1 mavaysigidsgstatkgilladg-vitrrf lvpt pfrpataiteaw 

SEQ ID NO: 16 1 -milgidvgstttkmvlmeds-kilwykiedigv—vieedillkmv 

SEQ ID NO: 10 51 kdtnlkiedmaniiatgygrfnvd-cakgevseitchakgalfecpgttt

SEQ ID NO: 14 50 enahmkkedmaftlatgygxmslegiadkqmselschamgasfiwpnvht 

SEQ ID NO: 15 47 etlreglettpfltltgygrqlvd-fadkqvteischglgarflapatra 

SEQ ID NO: 16 44 keieqkyp-idkivatgygrhkvs-fadlcivpevialgkganyf fneadg 

SEQ ID NO: 10 100 ildiggqdvksiklngqglvmqfamndkcaagtgrfldvmskvleipmse

SEQ ID NO: 14 100 vidiggqdvkvihve-ngtmtnfqmndkcaagtgrf ldvmanilevkvsd 

SEQ ID NO: 15 96 vidiggqdskviqldddgnlcdflmndkcaagtgrf levisrtlgtsveq 

SEQ ID NO: 16 92 vidiggqdtkvlkidkngkwdf ilsdkcaagtgkf lekaldilkidkne 

SEQ ID NO: 10 150 mgdwyfkskhpaavsstctvfaesevisllsknvpkedivagvhqsiaak

SEQ ID NO: 14 149 laelgakstkrvaisstctvfaesevisqlskgtdkidiiagihrsvasr 

SEQ ID NO: 15 146 1-dsitenvtphaitsmctvfaeseaislrsagvapeailagvinamarr 

SEQ ID NO: 16 142 ink--yksdniakissmcavfaeseiisllskkvpkegilmgvyesiinr 

SEQ ID NO: 10 200 acalvrrvgvgedltmtgggsrdpgvvdavskelgipvrvalhpqavgal 

SEQ ID NO: 14 199 viglanrvgivkdwmtggvaqnygvrgaleeglgvelktsplaqyugal

SEQ ID NO: 15 195 sanf iarlsceapilftggvahcqkfarmleshlrn5>vnthpdaq£agai 

SEQ ID NO: 16 190 vipmtnrlkl-qptiivf sggvaknkvlvemfekklnkyaiipkepqivccv 

SEQ ID NO: 10 250 gaaliaydkikk— 

SEQ ID NO: 14 249 gaalyaykkaak— 

SEQ ID NO: 15 245 gaavig-qrvrtrr 

SEQ ID NO: 16 239 gaily

Figure 14

ATGAGTGAAGAAAAAACAGTAGATATTGAAAGCATGAGCTCCAAGGAAGCCCTTGGTTAC

TTCTTGCCGAAAGTCGATGAAGACGCACGTAAAGCGAAAAAAGAAGGCCGCCTCGTTTGC

TGGTCCGCTTCTGTCGCTCCTCCGGAATTCTGCACGGCTATGGACATCGCCATCGTCTAT 

CCGGAAACTCACGCAGCTGGTATCGGTGCCCGTCACGGTGCTCCGGCCATGCTCGAAGTT

GCTGAAAACAAAGGTTACAACCAGGACATCTGTTCCTACTGCCGCGTCAACATGGGCTAC

ATGGAACTCCTCAAACAGCAGGCTCTGACAGGCGAAACGCCGGAAGTCCTCAAAAACTCC
CCGGCTTCTCCGATTCCCCTTCCGGATGTTGTCCTCACTTGCAACAACATCTGCAATACC
TTGCTCAAATGGTATGAAAACTTGGCTAAAGAATTGAACGTACCTCTCATCAACATCGAC 
GTACCGTTCAACCATGAATTCCCTGTTACGAAACACGCTAAACAGTACATCGTCGGGGAA 
TTCAAACATGCTATCAAACAGCTCGAAGACCTTTGCGGCCGTCCCTTCGACTATGACAAA 
TTCTTCGAAGTACAGAAACAGACACAGCGCTCCATCGCTGCCTGGAACAAAATCGCTACG
TACTTCCAGTACAAACGGTCGCCGCTCAAGGGCTTCGACCTCTTCAACTACATGGGGCTC 
GCCGTTGCTGGCCGCTCCTTGAACTACTCGGAAATCACGTTCAACAAATTCCTCAAAGAA
TTGGACGAAAAAGTAGCTAATAAGAAATGGGCTTTCGGTGAAAACGAAAAATGCCGTGTT
ACTTGGGAAGGTATCGCTGTCTGGATCGCTCTCGGCCACACCTTCAAAGAACTCAAAGGT 
CAGGGCGCTCTCATGACTGGTTCCGCTTATCCTGGCATGTGGGACGTTTGCTACGAACCG
GGCGACCTCGAATCCATGGCAGAAGCTTATTCCCGTACATACATCAACTGCTGCCTCGAA 
CAGCGCGGTGCTGTTCTTGAAAAAGTTGTCCGCGATGGCAAATGCGACGGCTTGATCATG 
CACCAGAACCGTTCCTGCAAGAACATGAGCCTCCTCAACAACGAAGGCGGCCAGCGCATC 
CAGAAGAACCTCGGCGTACCGTACGTCATCTTGGACGGCGACCAGACCGATGCTGGTAAC 
TTCTCGGAAGCACAGTTGGATACCCGCGTAGAAGCTTTGGCAGAAATGATGGCAGACAAA 
AAAGCCAATGAAGGAGGAAACCACTAA (SEQ ID NO: 17)

Figure 15

MSEEKTVDIESMSSKEALGYFLPKVDEDARKAKKEGRLVCWSASVAPPEFCTAMDIAIVY

PETHAAGIGARHGAPAMLEVAENKGYNQDICSYCRVNMGYMELLKQQALTGETPEVLKNS

PASPIPLPDVVLTCNNICNTLLKWYENLAKELNVPLINIDVPFNHEFPVTKHAKQYIVGE

FKHAIKQLEDLCGRPFDYDKFFEVQKQTQRSIAAWNKIATYFQYKPSPLNGFDLFNYMGL

AVAARSLNYSEITFNKFLKELDEKVANKKWAFGENEKSRVTWEGIAVWIALGHTFKELKG

QGALMTGSAYPGMWDVSYEPGDLESMAEAYSRTYINCCLEQRGAVLEKVVRDGKCDGLIM

HQNRSCKNMSLLNNEGGQRIQKNLGVPYVIFDGDQTDARNFSEAQFDTRVEALAEMMADK

KANEGGNH (SEQ ID NO: 18)

Figure 16

SEQ ID NO: 17 1 atgagtgaagaaaaaacagtagatattgaaagcatgagctccaaggaagc 

SEQ ID NO: 19 1 atg ccaaagacagta agccctggcgttcagg 

SEQ ID NO:20 1 atgatgaaattaaag— gcaattgaaaagttga—tgcaa 

SEQ ID NO: 21 1 atgtcacttgtcaccga tcta— cccgc 

SEQ ID NO: 17 51 cctt ggttacttcttgccgaaa—gtcgatgaagacgca -c

SEQ ID NO: 19 32 -cat tgagagatgtagttgaaaaggtttacagagaactg c 

SEQ ID NO: 20 37 aaatt cgcca— gtagaaaagaacagc 1 

SEQ ID NO: 21 27 cattttcgatcagttct—ctgaag—ctcgccagacaggctttctcacc 

SEQ ID NO: 17 89 gta-aagcgaaaa-aagaaggccgcctcgttt-gctggtccgcttctgtc

SEQ ID NO: 19 71 ggg-aaccgaaag«aaagaggagaaaaagtag-gctggtcctcttc — ca 

SEQ ID NO: 20 63 atataagcaaaaagaagaaggtagaaaagttt ttggaatgttctgtg

SEQ ID NO: 21 73 gtc-atggatctc-aaggag— cgcggcattccgctggt tggc 



SEQ ID NO: 17 136 gctcctccggaattctgcacggctatggacatcgccatcgtc — tatccg

SEQ ID NO: 19 116 agttcccctgcgaactggctgaatcttttcggctgcatgttgggtatccg 

SEQ ID NO: 20 110 cct atgttcca atagaaat aat~tt--tagcag 

SEQ ID NO: 21 112 act — tactgcacctttatg ccgcaagag atccc 

SEQ ID NO: 17 184 gaaactca — cgcagctggtatcggtgcc cgtcacggtg 

SEQ ID NO: 19 166 gaaaacca— ggctgctggtatcgctgccaaccgtgacggcgaagtgatg

SEQ ID NO: 20 140 caaatgcaatcccagttggtttgtgtgga ggtaaaaat 

SEQ ID NO: 21 144 ga -t— ggcagc cggtgcg -gtt gtg 

SEQ ID NO: 17 221 --------------------------------------ctccggccatgc

SEQ ID NO: 19 214 tgccaggctgcagaagatatcggttatgacaacgatatctgcggctatgc 

SEQ ID NO: 20 178 - gacacaa 

SEQ ID NO:21 166 gtttcgctctgt 

SEQ ID NO: 17 233 tcgaagt-t gctg aaaa — 

SEQ ID NO: 19 264 ccgtatt-tccctggcttatgctgccgggttocggggtgccaacaaaatg 

SEQ ID NO: 20 185 tcccaat-a gcag a 

SEQ ID NO: 21 178 tccacctct— gatg-----""— aaac — 

SEQ ID NO: 17 249 — caaaggttacaaccaggacatctgttcctactgccgcgtcaacatg — 

SEQ ID NO:19 313 gacaaagatggcaactatgtcatcaacccccacagcggcaaacagatgaa 

SEQ ID NO:20 198 ggaggat-ttgccaagaaacctatgcc cattaata 

SEQ ID NO: 21 195 ca™ttgaagaagcggagaaagat ctgccgcg-caacct- — 

SEQ ID NO: 17 295 ggctacatggaactc — ctcaaacagcag 

SEQ ID NO: 19 363 agatgccaatggcaaaaaggtattcgacgcagatggcaaacccgtaatcg

SEQ ID NO: 20 232 aaatc — atccta tg 

SEQ ID NO: 21 231 ctgcccg ctg— attaaa-agca 



SEQ ID NO: 17 322 . 

SEQ ID NO: 19 413 atcccaagaccctgaaaccctttgccaccaccgacaacatctatgaaatc 

SEQ ID NO: 20 245

SEQ ID NO: 21 251 ~ 

SEQ ID NO: 17 322 gctctgac aggcgaaa cgccggaa-gtcctcaa 

SEQ ID NO: 19 463 gctgctctgccggaaggggaagaaaagacccgccgccagaatgccctgca 

SEQ ID NO: 20 245 gttttaa gaa-ggca— aa 

SEQ ID NO: 21 251 gctacggc ttcggcaa ■ aaccg — at

SEQ ID NO: 17 354 aaactccccggcttctccgattccccttccggatgttgtcctcacttgca

SEQ ID NO: 19 513 caaatatcgtcagatgaccatgcccatgccggacttcgtgctgtgctgca

SEQ ID NO: 20 261 aacctgccc — ttactttgaagcatct gatatagttat-tggagaa

SEQ ID NO: 21 274 aaatgoccctacttct : acttttcggatctggtggtc— -ggtg

SEQ ID NO: 17 404 acaacatctgca ataccttgctcaaatggtatgaaaacttgg-

SEQ ID NO: 19 563 acaacatctgca actgcatgaccaaatggtatgaagacattg-

SEQ ID NO: 20 304 actacctgtgaaggaaagaagaagatgtttgagttgatggagagattggt

SEQ ID NO: 21 314 aaaccacctgcg acggcaaaaagaaaatgtatgaatacatgg-

SEQ ID NO: 17 446 -ctaaagaattgaac- — gtacctctca~~tcaacatcgacgtac~c

SEQ ID NO: 19 605 -cccgtcggcacaac attcctttga tcatgatcgacgttc — c

SEQ ID NO: 20 354 gccaatgcatataat gcacctcccacacatgaaagatgaagatt — c

SEQ ID NO: 21 356 -c- — ggagtttaagcctgttcatgtga~~tgca-attgoccaacagc

SEQ ID NO: 17 486 gttca — accatgaattc cctg — tta-cgaa ac — acgctaa

SEQ ID NO: 19 645 ttaca — ac gaattcgaccatg — tcaacgaa gccaacgtgaa

SEQ ID NO: 20 399 tttga — a aatct ggat — taa-agaagttgaa — aagctaa

SEQ ID NO: 21 397 gttaaggacgatgcctcg — -cgtgcgtta-tgga a

SEQ ID NO: 17 522 acagtacatcgtcg gcgaattcaaacatgctatca aacagc

SEQ ID NO: 19 684 a tacatccggt cccagc tgga tacggccatcc — —gtcaaa

SEQ ID NO: 20 434 — aagaat tggtt gagaaagagac tggaaataaaataacagaggaaaagt

SEQ ID NO: 21 429 — agccgagatgc t gcgct tgcaa a aaacgg

SEQ ID NO: 17 563 tcgaagacctttgcggccgtcccttcgactatgacaaattcttcgaagta

SEQ ID NO: 19 722 tagaagaaatcaccggcaagaagttcgatgaagacaaattc gaa

SEQ ID NO: 20 482 taaaaga gacagttgat — aaagta 

SEQ ID NO: 21 458 tagaagaacgttttgggcacgagattagcgaagatgctctgcgcgatgcc 

SEQ ID NO: 17 613 cagaaacagacacagcgctc-catcg — ctgcc tggaacaaaat

SEQ ID NO: 19 766 cag-tgctgccagaacgc-c-aaccgtactgccaaagcatggctgaaggt 

SEQ ID NO: 20 505 aataaagttagggag 1 — tgttttataaa 

SEQ ID NO: 21 508 attgcgctgaaaaaccgcgaacgtcg — cgcac "tgg ctaat 

SEQ ID NO: 17 654 cgctacgtacttc — c— -agtacaaaccgtcgccgctcaacggcttcgac 

SEQ ID NO: 19 813 ttgcgactacctg — c — agtacaaaccggctccgttcaacgggttcgac 

SEQ ID NO: 20 532 ctctatgaattga — ggaagaataaaccagctccaattaagggtttagat 

SEQ ID NO: 21 547 ttttatcatcttgggc — agttaaatcctccggcgcttagcggcagcgac 

SEQ ID NO: 17 700 ctcttcaactacatgggcctcgccg-ttgctgcccgctccttgaactact 

SEQ ID NO: 19 859 ctgttcaaccatatggctgacgtgg-ttaccgcccgtggccgtgtggaag 

SEQ ID NO: 20 580 gttttaaaattattccagtttgcctatttattggatattgatgacacaat 

SEQ ID NO: 21 595 attctga aagtggtttacggcg-caaccttccggttcgataaagagg 

SEQ ID NO: 17 749 cggaaatcacgttcaacaaattcctcaaagaattggacgaaaaagtagc- 

SEQ ID NO: 19 908 ctgctgaagctttcgaactgctggccaaggaactggaacagcatgt 

SEQ ID NO: 20 630 agggatt ttagaggatttaattgaggagttagaggagagagtt 

SEQ ID NO: 21 641 eg ttgafccaatgaactggatgcaatgaccgcc 

SEQ ID NO: 17 798 taataagaaatgggctttcggtgaa aacgaaaaatcccg 

SEQ ID NO: 19 954 — ■ gaaggaaggcaccaccaccgctcccttcaaagaacagcatcg 

SEQ ID NO: 20 673 aaaaaaggagaaggttatgaaggaa agagaa 

SEQ ID NO: 21 673 cgcgttcgtcagcagtgggaagaag — gec agcgactggacccg 

SEQ ID NO: 17 837 tgttacttgggaaggta-tcgctgtctggatcgctctcggccacacc 

SEQ ID NO: 19 996 tatcatgttcgaaggga-tcccctgctgg — ccgaaactgccgaacc

SEQ ID NO: 20 704 — ? ttttaataac-tggctgfcc-caatggttgctggaaacaataag 

SEQ ID NO: 21 715 cgt~ccgcgcattttaatcaccggctg cccgattggcggcgc

SEQ ID NO: 17 883 — ttcaaagaactca — aaggtcagggcgctctcatgactggttcc

SEQ ID NO: 19 1040 tgttcaaaccgctga—aagccaacggcctgaacatcaccggcgtt 

SEQ ID NO:20 745 attgt— tgaaattattgaggaagtt— ggaggagtagttgttggfcgaa 

SEQ ID NO: 21 756 agcaga — aaaagtggtgcgcgcgattgaagagaatg

SEQ ID NO: 17 925 gcttat cctggcatgtgggacgtttcctacgaacc ggg- 

SEQ ID NO: 19 1084 gtatatgctcctgctttcgggttcgtgtacaacaacct gga- 

SEQ ID NO:20 790 g — aaa gctgcactggaacaagattctttgaaaactttgttgaggg- 

SEQ ID NO: 21 791 gc g gctgggttgtcggttatgaaaactgcacc gggg 

SEQ ID NO: 17 963 cga cctcg-aatccatggcagaa gcttattcccgtac 

SEQ ID NO: 19 1125 cga attgg tcaaagcctact gcaaagccccgaac 

SEQ ID NO: 20 834 ctatagcgtag-aggacattgcaaaa agata cttta 

SEQ ID NO: 21 827 cgaaagcga ccgagcaatgcgtggcagaaacgggcgatgtctacgac 

SEQ ID NO: 17 999 atac atcaactgctgcct cgaacagcgcggtgct 

SEQ ID NO: 19 1159 -tec -gtca gcat cgaacagggtgttgcc 

SEQ ID NO: 20 869 aaat— ~ — cccatgtgcttgtagatttaaaaacgatgagagagttgaa 
SEQ ID NO: 21 874 gcgctggcggataaatatctggc — . gattggctgctcct 

SEQ ID NO: 17 1033 gttcttgaaaaagttgtccgcgatggcaaatgcgacggc-ttgatcatgc 

SEQ ID NO: 19 1186 tggcgtgaaggcctgatccgcgacaacaaggttgacggc-gtactggttc 
SEQ ID NO: 20 913 aatataaagagattggttaaagagttggacgtcgatggagttgtttat — 
SEQ ID NO:21 911 gtgtttcgccgaacgatcagcgcctgaaaatgc-tcagc-cagatggtgg 

SEQ ID NO: 17 1082 accagaacc-gttcctgcaagaacatgagcctcctcaacaacgaaggcg- 

SEQ ID NO: 19 1235 actacaacc-ggtcctgcaaaccctggagcggctacatgcctgaaatgc- 
SEQ ID NO: 20 961 — - tacac-tttgcagtattgccat-— acatttaacatagagggagc 
SEQ ID NO: 21 959 aggaatatcaggtcgatggcgtagttga — -—tgtgattttgcaggcgt 

SEQ ID NO: 17 1130 gccagcgcatc-cagaagaacctc — ggcgtaccgtacgtcatcttc 

SEQ ID NO: 19 1283 agcgtcgtttc-accaaagacatg — ggtatccccactgctggattc 

SEQ ID NO: 20 1002 taaggtagaggagg-cattaaaagaggagggcattccaattataagaatt 

SEQ ID NO: 21 1004 gccatacctacgcggtggaatcgc — tggcgattaaacgtoatgtgc 

SEQ ID NO: 17 1174 gacggcgaccagaccgatgctcgtaacttctcggaagca 

SEQ ID NO: 19 1327 gacggtgaccaggctgacccgagaaacttcaacgcggct—™- — ~~ 

SEQ ID NO: 20 1051 gaaactgactattctga aagtgatag — agag 

SEQ ID NO:21 1049 gccagc-agcacaacattccttatatcgctattgaaacagactactccac 

SEQ ID NO: 17 1213 cagttcgatacccgcgtagaagctttggcagaaatga

SEQ ID NO: 19 1366 cagtatgagacccgtgttcagggcttggtcgaagcca

SEQ ID NO: 20 1081 cagttaaaaacaaggttggaggcatttattgagatga 

SEQ ID NO: 21 1098 ctcggatgtcgggcagctcagtacccgtgtcgcggcctttattgagatgc 

SEQ ID NO: 17 1250 tggcagacaaaaaagccaatgaaggaggaaaccactaa 

SEQ ID NO: 19 1403 tggaag-caaatgatgaaaagaagg-ggaaataa— ~ 

SEQ ID NO: 20 1118 t ttaa 

SEQ ID NO: 21 1148 tgtaa

Figure 17

SEQ ID NO: 18 1 — mseektvdiesmsskealgyflpkvdedarkakkegrlvcwsasvapp

SEQ ID NO: 22 1 mpktvs pgvqalrdvvekvyrelrepkergekvgwssskfpc 

SEQ ID NO: 23 1 mmklka— ieklmqkf a srkeqlykqkeegrkvfgm . 

SEQ ID NO: 24 1 mslvtdlpaifdqf searqtg-f ltvmdlkergiplvg 

SEQ ID NO: 18 49 efctamdiaivypethaag igarhgapamlevaenkgynqdicsycr

SEQ ID NO: 22 43 elaesfrlhvgypenqaag iaanrdgevmcqaaedigydndicgyar 

SEQ ID NO: 23 35 -fcayvpieiila-anaip vglcggkndtipiae-edlprnlcplik 

SEQ ID NO:24 38 tyctfropqei pmaagawvslcstsdetieeae-kdlprnlcplik 

SEQ ID NO: 18 96 vnmgym 

SEQ ID NO: 22 90 islayaagfrgankmdkdgnyvinphsgkqmkdangkkvfdadgkpvic^p 

SEQ ID NO: 23 79 ssygf —

SEQ ID NO: 24 83 ssygf — 

SEQ ID NO: 18 102 ellkqqaltgetpev — ■ lknspaspiplpdwltcnn 

SEQ ID NO:22 140 ktlkpf attdniyeiaalpegeektrrqnalhkyrcpatmpmpdfvlccnn 

SEQ ID NO: 23 84 kkaktcpyfeasdiviget 

SEQ ID NO: 24 88 gktdkcpyf y fsdlwg-et 

SEQ ID NO: 18 137 icntllkwyenlakelnvplinidvpfnhefpvtkhakqyivgef ichaik 

SEQ ID NO: 22 190. icncaaikwyediafrhnipllnd 

SEQ ID NO: 23 103 tcegkkkmfelm--erlvpmhimhlphmkd edslkiwikeveklke 

SEQ ID NO: 24 107 tcdgkkkmyeymaef kpvhvmqlpnsvkdd asralwkaemlrlqk 

SEQ ID NO:18 187 ql^lcgrpfdycUf f^~-vq^ 

SEQ ID NO:22 240 cmoeeitgkkf dedkfeq-~ccqnanrtakawlkvcdylqykpapfngfd 

SEQ ID NO: 23 147 lveketgnkiteeklke tvdkvnkvrelfyklyelrknkpapikgld 

SEQ ID NO: 24 152 tveerfgheisedalrdaialknrerralanfyhlg qlnppalsgsd 

SEQ ID NO: 18 234 lfnymglavaarslnyseitfnkf lkeldekvan— kkwaf ge--n- 

SEQ ID NO: 22 287 lfnhmadwtargrveaaeaf ellakeleqhvke—gtttapf — k- 

SEQ ID NO: 23 194 vlklfqfaylldiddtigile dlieeleerv kk ge— gy 

SEQ ID NO: 24 199 ilk wygatfrfdk ealineldamtarvrqqweegqrld- 

SEQ ID NO: 18 276 eksrvtwegiavwialg^ say pgmwdvsy

SEQ ID NO: 22 329 eqhrimfegipc%»pklpnlfkplkanglxiitg wy apafgfvy 

SEQ ID NO: 23 231 egkrilitgcpmvagnnkiveiieevggwvg eesctgtrf fenfv 

SEQ ID NO: 24 238 prprilitgcpiggaaekwraieenggwwgyenctga kateqcva 

SEQ ID NO: 18 319 epgdl-esmaeaysrtyinccl — eqrgavlekvyrdgkcdgllmhqnrs

SEQ ID NO: 22 372 --nnl-delvkayckaphsvsl--eqgvaw

SEQ ID NO: 23 277 egysv-ediakr yf kipcacrf kndervenikrlvkeldvdy v vyytlqy 

SEQ ID NO: 24 285 etgdvydaladkylaigcscvspndqrlkmlsqpaveeyqvdgvvdvilqa

SEQ ID NO: 18 366 cknmsllnnegg— qriqknlgvpyvifdgdqtdarnf seaqfdtrveal 

SEQ ID NO: 22 417 ckpwsgympemq — rrftkdmgiptagfdgdqadprnfnaaqy«trvqgl 

SEQ ID NO: 23 326 -cht fniegakveealkeegipiirietdyses dreqlktrleaf

SEQ ID NO: 24 335 chtyaveslaik— -rhvrqqhnipyiai etdystsdvgqlstrvaaf 

SEQ ID NO: 18 414 aemmadkkaneggnh 

SEQ ID NO: 22 465 veameandekkgk— 

SEQ ID NO: 23 370 iemi 

SEQ ID NO:24 380 ieml 
Figure 18



ATGAGT€AGATC<SAC<3AACTTATCAGCAAATTACAGGAAGTATCCAACCATCCCCAGAAG 
ACGGTTTTGAATTATAAAAAACAGGGTAAAGGCCTGGTAGGCATGATGCC<iTACTAG<3CT 
CCGGAAGAAATCGTATATGCTGCAGGCTAGCTGCCGGTAGGCATGTTCGGTTCGCAGAAC 
CCGCAGATCTGCGCAGCTCGTACGTAGCTTCCTCCGTTCGCTTGCTCCTTGATGCAGGCT 
GACATGGAACTCCAGCTCAACGGCACCTATGACTGCCTCGAGGCTGTTATCTTCTGCGTT 
CCTTGCGACACTCTCCGCTGCATGAGCCAGAAATGGCACGGCAAAGCTCCGGTCATCGTC 
TTCACACAGGCGCAGAAGCGTAAGATCCGCGCGGCTGTCGATTTCCTCAAAGCTGAATAC 
GAACATGTCCGTACGGAATTGGGACGTATCCTCAACGTAAAAATCTCCGACCTGGCTATC 
CAGGAAGCTATCAAAGTATATAACGAAAAGCGTCAGGTTATGCGTGAATTCTGCGACGTA 
GCTGCTCAGTACCCGCAGATCTTCACTCCGATAAAACGTCATGACGTCATCAAAGCCCGC 
TGGTTCATGGAGAAAGCTGAACACACCGCTTTGGTCCGCGAACTCATGGACGCTGTCAAG 
AAAGAACCGGTACAGCCGTGGAATGGCAAAAAAGTCATCCTCTCCGGTATCATGGCAGAA 
CCGGATGAATTCCTCGATATCTTCAGCGAATTGAAGAT<^^ 

GCTCAGGAATCCCGCCAGTTCCGTACAGACGTACCGTCGGGCATCGATCCCCTCGAACAG 
CTCGCTCAGCAGTGGCAGGACTTCGATGGCTCCCCGCTCGCTTTGAACGAAGACAAACCX5 
CGTGGCCAGATCCTCATGGACATGACTAAGAAATACAATGCTGAGGCCGTCGTCATCTGC 
ATGATGGGTTTCTGCGATCCTGAAGAATTGGACTATCCGATTTACAAAGCGGAATTTGAA 
GCTGCTGGCGTTCGTTACACGGTCCTCGAGCTCGACATCGAATCTCCGTGCCTCGAACAG 
CTCCGCAGCCGTATCCAGGCTTTCTGGGAAATCCTCTAA (SEQ ID NO: 25) 
Figure 19

MSQIDELISKLQEVSNHPQKTVLNYKKQGKGLVGMMPYYAPEEIVYAAGYLPVGMFGSQN
PQISAARTYLPPFACSLMQADMELQLNGTYDCLDAVIFSVPCDTLRCMSQKWHGKAPVIV
FTQPQNRKIRPAVDFLKAEYEHVRTELGRILNVKISDLAIQEAIKVYNENRQVMREFCDV
AAQYPQIFTPIKRHDVIKARWFMDKAEHTALVRELIDAVKKEPVQPWNGKKVILSGIMAE
PDEFLDIFSEFNIAVVADDLAQESRQFRTDVPSGIDPLEQLAQQWQDFDGCPLALNEDKP
RGQMLIDMTKKYNADAVVICMMRFCDPEEFDYPIYKPEFEAAGVRYTVLDLDIESPSLEQ
LRTRIQAFSEIL (SEQ ID NO: 26)
Figure 20

SEQ ID NO: 25 1 atgagtcagatcgacgaacttatcagcaaat^acaggaagtatccaacca 

SEQ ID NO: 27 1 atggct atcagtgcacttattgaagagttccaaaaagtat-^ctgcca 

SEQ ID NO: 28 1 atgatgaaattaaaggcaattgaaaagttgatgcaaaaat 

SEQ ID NO: 29 1 atgtcacttgtcaccgatctacccgccattttcgatcagttctctgaagc

SEQ ID NO: 25 51 tccccagaag ac ggttttg aattataaaaaa 

SEQ ID NO: 27 47 gccc — gaag ac catgctggccaaatataaagcc 

SEQ ID NO: 28 41 tcgccagtag aaaagaacagctatat aagcaaaaagaa 

SEQ ID NO: 29 51 tcgccagacaggctttctcac cgtcatg — -gatctcaaggag 

SEQ ID NO: 25 82 cagggtaaaggcctcgtaggca — tgatgcoctactacgctccggaagaa 

SEQ ID NO: 27 79 cagggcaaaaaagccatcggct — gcctgccgtactatgttccggaagaa 

SEQ ID NO: 28 79 gaaggtagaaaagtttttggaa — tgttctgtgcctatgttccaatagaa 

SEQ ID NO: 29 91 cgcggcattccgctggttggcacttactgcacctttatgc — cgcaagag 

SEQ ID NO: 25 130 atcgtatatgctgcaggctacctcccggtaggcatgt tcggtfcccca 

SEQ ID NO: 27 127 ctggtctatgctgcaggcatggttcccatgggtgtat ggggctgcaa 

SEQ ID NO: 28 127 ataattttagcagcaaatgcaatcocagttggtttgt gtggaggtaa 

SEQ ID NO: 29 139 atcccgatggcagccgg tgcggttgtggtttcgctctgttccac 

SEQ ID NO: 25 177 -r~~rgaacccgcag-atctccgcagctcgtacgtaccttoctccgtt

SEQ ID NO: 27 174 tggcaaacaggaaqt«gttccaaggaa-tactgtgctt<:ctt 

SEQ ID NO: 28 174 — aaatgacaca-atcccaatagcagaggaggatttgccaagaaa 

SEQ ID NO: 29 183 ctctgatgaaacc— — attgaagaagcggagaaagatctgccgcgcaa 



SEQ ID NO: 25 219 cgcttgctccttgatgcaggctgacatggaactccagctcaacggca 

SEQ ID NO: 27 216 ctactgcaccattgcccagcagtctctggaaatgctgctggacggga--- 

SEQ ID NO: 28 216 cctatgcccattaataaaatcatcctatggttttaag aaggca 

SEQ ID NO: 29 228 cctctgcccgctga ttaaaagcagctacggct — tcggcaaaa 

SEQ ID NO: 25 266 cctatgactgccfccgacgctgttatcttctec gttcct-tgcg 

SEQ ID NO: 27 263 ccctggatgggttggacgggatcatca-ctcc ggtactgtgtg 

SEQ ID NO: 28 259 — aaaacctgcccttactttg-aagcatctgatatagttatt-ggag 

SEQ ID NO: 29 269 ccgataaatgcccctac ttctacttttc ggatct-ggtggtc 

SEQ ID NO: 25 308 acactctecgctgcatgagecagaaat gg — c- 

SEQ ID NO: 27 305 ataccctgcgtcccatgagccagaacttcaaagtgg-- — ~~cc 

SEQ ID NO: 28 302 aaact acctgtgaa -gg a- 

SEQ ID NO: 29 310 ggtgaaaccacctgcgacggcaaaaagaaaa tgtatgaatac- 

SEQ ID NO: 25 338 - — ~acggcaaagct- ccggtcatcg-tcttcacacagccgcagaac

SEQ ID NO: 27 343 atgaaagacaagatg ccggttattt-tcctggctcatccccaggtc 

SEQ ID NO: 28 319 aagaagaagat gtttgagttgatggagagattggtgecaatg 

SEQ ID NO: 29 352 atggcggagtttaagcctgttcatg-tgatgcaattgcocaacagc 

SEQ ID NO: 25 379 cgtaaga-tccgcccggc tgtcgatttcctcaaag-ct 

SEQ ID NO: 27 388 cgtcagaatgccgccggc • — — aagc-agttcacctatg-at 

SEQ ID NO: 28 361 catataa-tgcacctcccacacatgaaagatgaagattctttgaaaatct 

SEQ ID NO:29 397 gttaagg-acgatgcctc — gcgtgcgttatggaaag-cc 

SEQ ID NO: 25 415 gaat — acgaacatgtc cgt acgg — aattgg— — gacg

SEQ ID NO: 27 424 gcct — acagcgaagt ga aaggccatctgg aaga 

SEQ ID NO: 28 410 ggattaaagaagttgaaaagcta aaag — aattggttgagaaa 

SEQ ID NO: 29 433 ga gatgctgcg cttgcaaaaaacgg — tagaag aacg 
SEQ ID NO: 25 447 tatcctcaacgtaaaa — atcfcccgacctggctatccaggaagctatcaa

SEQ ID NO: 27 456 aatctgcggccatgaa — atcaccaatgatgccatcctggatgpcatcaa 

SEQ ID NO: 28 451 gagactggaaataaaataacagaggaaaagttaaaagagac^agttgataa 

SEQ ID NO: 29 468 ttttgggcacg ag — attagcgaagatgctctgcgcgatgccattgc 

SEQ ID NO: 25 495 agtatataacgaaaaccgtcaggttatgcgtgaattct — ■ gcg 

SEQ ID NO: 27 504 agtgtacaacaagagccgtgctgcccgccgcgaattct gca 

SEQ ID NO:28 501 agtaaataaagtta gggagttgttttataaactct atg 

SEQ ID NO: 29 513 gctgaaaaaccgcgaacgtcgcgcactggctaatttttatcatcttgggc

SEQ ID NO: 25 536 acgtagctgctcag taoccgcagatcttcactccgataaa — acg 

SEQ ID NO: 27 545 aactggc — caacg aacatcctgatctgatcccggcttccgtacg 

SEQ ID NO: 28 539 a-attgaggaagaa«^~taaac-cag — -ctccaattaa — ggg 

SEQ ID NO: 29 563 agttaaatcctccggcgcttagcggcag — cgacattctgaaagt — ggt

SEQ ID NO: 25 579 tcatgacgtcatc aaag eccgctgg ttca 

SEQ ID NO: 27 588 ggccaccgtactg : cgtg ccgcttac ttca 

SEQ ID NO: 28 573 tttagatgtttta aaattattccagtttgcctatttat 

SEQ ID NO: 29 609 ttacggcgcaaccttccggttcgataaag aggcgttg : — atca 

SEQ ID NO: 25 608 tggacaaagctgaacacaccgctttggtccgcgaactcatcgacgctgtc

SEQ ID NO: 27 617 tgctgaaggatgaatacaccgaaaagctggaagaactgaacaagg 

SEQ ID NO: 28 -6nTt~ggatral^^ 

SEQ ID NO:29 650 atgaactggatgcaatgaccgc ccgcg — ttcgtcagcagtggg 

SEQ ID NO: 25 658 aagaa ag— aaccggtacagccgtggaat ggcaaaaaa 

SEQ ID NO: 27 662 aactg- gc — agctgctcctgccggcaagttcgacggccacaaa 

SEQ ID NO: 28 661 gaggagagagttaa — aaaaggagaaggttatgaa ggaaagaga

SEQ ID NO: 29 692 aagaa ggccagcgactggacccgcgtccg cgcatttta

SEQ ID NO: 25 694 gtcatcctctccggt atcatggcagaaccggatgaattcct 

SEQ ID NO: 27 703 gtggttgtttccggc atcatctacaacacgcccggcatcct 

SEQ ID NO: 28 703 attttaataactggctgtccaatggttgctggaaacaataagattgt 

SEQ ID NO: 29 730 atcaccggctgcccg attggcggcgcagcagaaaaagtggtgcg 

SEQ ID NO: 25 735 cgatatcttcagcgaatt-caacatcgctgtcgtcgctgacgacctc-gc 
SEQ ID NO: 27 744 gaaagccatggatgacaa-caaactggccattgctgctgatgactgc-gc
SEQ ID NO: 28 750 tgaaattattgaggaagt-tggaggagtagttgttggtgaagaaagctgc 
SEQ ID NO: 29 774 cgcgat-tgaagagaatggcggctgggttgtcggttatgaaaactgc-ac 

SEQ ID NO:25 783 tcagga-atcccgccagttccg^acagacgtaccgtccggcatcgatccc 

SEQ ID NO: 27 792 ttatga-aagccgcagctttgccgtggatgctccggaagatctgga c 

SEQ ID NO: 28 799 actgga-a caagattctttgaaaactttgttgagg — gctatagc 

SEQ ID NO: 29 822 cggggcgaaagcgaccgagcaatgc-gtggcagaaacggg— cgatgtc 

SEQ ID NO: 25 832 ctcgaacagctcgctcag cagtgg-- caggacttcgat-g 

SEQ ID NO: 27 838 aacggactgcatgctctggctgtacagttctccaaacagaagaacgat-g 

SEQ ID NO:28 841 gtagaggacattgcaaa aaga-tacttt-a 

SEQ ID NO: 29 868 tacgacgcgctggcggat aaatat ctgg -cgattg 

SEQ ID NO: 25 869 gctgcccgctcgctttgaa cgaagacaaaccgcg-tggccag 

SEQ ID NO: 27 887 ttctgctgtacgatcc tgaatttgccaagaatacccgttctgaacac 

SEQ ID NO: 28 869 aaatcccatgtgcttgta gatttaaaaacgat-gagagag 

SEQ ID NO: 29 902 gctgctc-ctgtgtttcgc cga--acgatcagcg-cctgaaa 

SEQ ID NO: 25 910 atgctcatcgaca tgactaagaaatacaatgctgacgccgtcgtc 

SEQ ID NO: 27 934 gttggca ate tggtaaaagaaagcggcgcagaaggactgatc 

SEQ ID NO: 28 908 ttgaaaatataaagagattggttaaagagttggacgtcgatggagttgtt 
SEQ ID NO: 29 940 atgctcagccaga tggtggaggaatatcaggtcgatggcgtagtt 
SEQ ID NO: 25 955 atctgcatgatgcgt-ttctgcgatcctgaagaattcgactatc cgat

SEQ ID NO: 27 976 gtgttcatgatgcagttctgcgatccggaagaaatggaatatc ctga 

SEQ ID NO: 28 958 tattacactttgcagtattgccatacatttaacatagagggag ctaa 

SEQ ID NO:29 985 gatgtgattttgcaggcgtgccatacctacgcggtggaatcgctggcgat 

SEQ ID NO: 25 1002 ttacaaaccggaatttgaagctgctgg cgttcgttacacggtcctc

SEQ ID NO: 27 1023 tctgaagaaggctctggatgcccacca cattcctcatgtgaagatt 

SEQ ID NO: 28 1005 ggtagaggaggcattaaaagaggaggg cattc— caattata 

SEQ ID NO: 29 1035 t aaacgtcatgtgcgccagcagcacaacattccttatatcgctatt 

SEQ ID NO: 25 1048 gacctcgacatcgaatctccgtccctcgaa cagctccgcacccg 

SEQ ID NO: 27 1069 ggtgtggaccagatgacccgggactttggt — caggcccagaccgc 

SEQ ID NO: 28 1045 agaattgaaactgactattctgaaagtgatagagagcagttaaaaacaag 

SEQ ID NO: 29 1081 gaaacagactactccacctcggatgtcggg cagctcagtacccg 

SEQ ID NO: 25 1092 tatccaggctttctcggaaatcctctaa 

SEQ ID NO: 27 1113 tctggaagctttcgcagaaagcctgtaa 

SEQ ID NO: 28 1095 gttggaggcatttattgagatgatttaa 

SEQ ID NO: 29 1125 tgtcgcggcctttattgagatgctgtaa 



Figure 21

1 msqidelisklqevsnhpqk tvlnykkqgkglvgmmpyyapeeivya 

1 -maisalieefqkvsaspkt mlakykaqgkkaigclpyyvpeelvya 

1 mmkl-kaieklmqkfasrke qlykqkeegrkvfgmfcayvpieiila 

1 mslvtdlpaifdqfsearqtgfltvmdlkergiplvgtyctfnqpqeipma 

48 agylpvgmf gsqnpqisaartylppf acslmqadmelqlngt ydc —

47 agmvpmgvwgcngkqevrskeycasf yctiaqqslemlldgt Idg — 

47 anaipvglcggkndtipiaeedlprnlcplikssygfkkaktcpyfea — 
51 agawvslcstsdetieeaekdlprnlcplikss — ygfgkt dkcpy 

93 ldavifsvpcdtlrcmsqkwh gkapvivftqpqnrkirpavdf 

92 ldgiitpvlcdtlrpmsqnfkvamkdkmpviflahpqvrqnaagkqf 

95 sdivigettcegkkkmfelme rlvpmhimhlp-hmkdedslki 

96 fyfsdlwgettcdgkkkmyeyma efkpvhvmqlpnsvkddasral 

136 lkaeyehvrtelgrilnvkisdlaiqeaikvynenrqvmrefcdvaaqyp 
139 tydaysevkghleeicgheitndaildaikvynksraarrefcklanehp 

137 wikeveklkelveketgnkiteeklketvdkvnkvrelfyklyelrknkp 
142 wkaemlrlqktveerfgheisedalrdaialknrerralanfyhlgqlxy? 

186 qiftpikrhdvik arwf mdkaehtalvrelidavkk — epvqp

189 dlipasvratvlr aayf mlkdeytekleelnkelaa — apagk 

187 apikgldvlk lfqfaylldiddtigiledlieeleervkkgeg 

192 palsgsdilkwygatfr fdk ealinel-damta — rvrqq 

227 wn-gkk- -vilsg—imaepdefldif sefniawaddlaqesrqf 

230 fd-ghk msg — iiyntpgilkamddnklaiaaddcayesrsf 

230 ye-gkr ilitgcpmvagnnkiveiieevggvwgeesctgtrf f 

230 weegqrldprprilitgcpiggaaekwraieenggwwgyenctgakat 

268 rtdvpsgidp-leqlaqqwqdfdgcplalned kprg<palidmtkkyn 

271 avdapedldnglhalavqfskqkndvllydpefakntrsehvgnlvkesg 

273 enfv-egya— vediakryf kip-cacrfknd e-rvenikrlvkeld 

280 eqcvaetgdv-ydaladkylai-gcscvspnd q-rlkxalsqmveeyq 

314 adawicnnnrfcdpeefdypiykpef-eaagvrytvldldiespsleqlr 
321 aeglivfmmqfcdpeemeypdlkkal-dahhiphvkigvdqnitrdfgqaq 

315 vdgwyytlqychtfniegakveeal-keegipiirietdysesdreqlk 
324 vdgwdvilqachtyaveslaikrhvrqqhnipyiaietdystsdvgqls 

363 triqafseil 
370 taleafaesl 

364 trleafiemi 
374 trvaafieml 
Figure 22

1 CGACGGGCCG GGCTGGTATC ATTCTAGTCA GTAATTCACC TTTGGAAAAT TTTCACAAAG 

61 GCAGTACGAC AGAAGCGTCG ATACATTCCA TTTAGCAGGA GGAAGTTACG QTAATGAGAA

121 AAGTAGAAAT CATTACAGCT GAACAAGCAG CTCAGCTCGT AAAAGACAAC GACACGATTA 

181 CGTCTATCGG CTTTGTCAGC AGCGCCCATC CGGAAGCACT GACCAAAGCT TTGGAAAAAC 

241 GGTTCCTGGA CACGAACACC -CCGCAGAACT TGACCTACAT CTATGCAGGC TCTCAGGGCA 

301 AACGCGATGG CCGTGCCGCT GAACATCTGG CACACACAGG CCTTTTGAAA CGCGCCATCA 

361 TCGGTCACTG GCAGACTGTA CCGGCTATCG GTAAACTGGC TGTCGAAAAC AAGATTGAAG 

421 CTTACAACTT CTCGCAGGGC ACGTTGGTCC ACTGGTTCCG CGCCTTGGCA GGTCATAAGC 

481 TCGGCGTCTT CACCGACATC GGTCTGGAAA CTTTCCTCGA TCCCCGTCAG CTGGGCGGCA 

541 AGCTCAATGA CGTAACCAAA GAAGAGCTCG TCAAACTGAT CGAAGTCGAT GGTCATGAAC 

601 AGCTTTTCTA CCCGACCTTC CCGGTCAACG TAGCTTTCCT CCGCGGTACG TATGCTGATG 

661 AATCCGGCAA TATCACCATG GACGAAGAAA TCGGGGCTTT CGAAAGCACT TCCGTAGGCC 

721 AGGCCGTTCA CAACTGTGGC GGTAAAGTCG TCGTCCAGGT CAAAGACGTC GTCGCTCACG 

781 GCAGCCTCGA CCCGCGCATG GTCAAGATCC CTGGCATCTA TGTCGACTAC GTCGTCGTAG 

841 CAGCTCCGGA AGAOCATCAG CAGACGTATG ACTGCGAATA CGATCCGTCC CTCAGCGGTG 

901 AACATCGTGC TCCTGAAGGC GCTAOCGATG CAGCTCTCCC CATGAGCGCT AAGAAAATCA 

961 TCGGCCGCCG CGGCGCTTTG GAATTGACTG AAAACGCTGT CGTCAACCTC GGCGTCGGTG 

1021 CTCCGGAATA CGTTGCTTCT GTTGCCGGTG AAGAAGGTAT CGCCGATAGC ATTACCCTCA 

1081 CCGTCGAAGG TGGCGCCATC GGTGGCGTAC £GCAGGGCGG TGGCCGCTTC GGTTCGTGCC 

1141 GCAATGGCGA TGCCATCATC GACCACACCT ATCAGTTCGA CTTCTACGAT GGCGGCGGTC 

1201 TGGACATCGC TTACCTCGGC CTGGCCCAGT GCGATGGCTC GGGCAACATC AACGTCAGCA 

1261 AGTTCGGTAC TAACGTTGCC GGCTGCGGCG GTTTCCCCAA CATTTCCCAG CAGACACCGA 

1321 ATGTTTACTT CTGCGGCACC TTCACGGCTG ^CGGCTTGAA AATCGCTGTC GAAGACGGCA 

1381 AAGTCAAGAT CCTCCAGGAA GGCAAAGCCA AGAAGTTCAT CAAAGCTGTC GACCAGATCA 

1441 CTTTCAACGG TTCCTATGCA GCCCGCAACG GCAAACACGT TCTCTACATC ACAGAACGCT 

1501 GCGTATTTGA ACTGACCAAA GAAGGCTTGA AACTCATCGA AGTCGCACCG GGCATCGATA 

1561 TTGAAAAAGA TATCCTCGCT CACATGGACT TCAAGCCGAT CATTGATAAT CCGAAACTCA 

1621 TGGATGCCCG CCTCTTCCAG GACGGTCGCA TGGGACTGAA AAAATAAATC TCTGCTGTAA 

1681 AGGAGACTTT ACTATGAAAC CAATGAGACT ACATCACGTA GGCATTGTGC TGCCGAGCTT 

1741 AGAAAAAGCC CATGAATTCA TGCAGAATAA TGGACTTGAA ATCGACTATG CCGGCTATGT 

1801 CGATGCTTAC CAGGCTGATC TCATTTTCAC TAAGTTTGGT GAATTTGCCA GCCCGATTGA 

1861 AATGATTATC CCGCACTCCG GTGTGCTTAC CCAATTCAAT GGTGGGCGCG -GCGGCATTGC 

1921 CCACATCGCC TTCGAAGTGG ACGATGTCGA AGCTGTGCGC CAGGAAATGG AAGCAGATTG 

1981 TCCGGGATGC ATGTTAGAAA AGAAAGCTGT CCAGGGTACG GACGACATTA TCGTCAACTT 

2041 GCGCCGCCCG ACAACCAACC AGGGTATCCT CGTTGAATAT GTTCAGACGA CAGCACCTAT 

2101 CACCGGCCGC GGCGAAAATC CTTTCGTTAA GAATCTGGGC CCGGAAAAAG GGAAGCTCAA 

2161 CGAAACATGG CATCCCATGC GCCTGCACCA TATCGGCATC GTCTTGCOGA -CCTTGGAAAA 

2221 GGCCCATGAA TTCATCAAGA CCAATGGTCT GGAAGTGGAT TATTOCGGTT TCGTCGACGC 

2281 CTACCATGCG GATCTCATTT TCACTAAAAA AGGTGAAAAC AGTACGCCTA TCGAATTCAT 

2341 TATTCCCCGT GAAGGGGTCC TCAAAGATTT CAATCATGGC AGGGGAGGTA TCGCTCATAT 

2401 CGCCTTTGAA GTGGATGATG TCGAAAAGGT ACGTCAGATT ATGGAAAGCC AGAAGCCTGG 
2461 TTGCATGCTC GAAAAGAAAG CCGTCCGGGG AACGGACGAT ATCATCGTCA ACTTCCGCCG

2521 TCCCAGCACG GACGCGGGCA TCCTCGTCGA ATATGTCCAG ACCGTAGCTC -CCATCAATCG

2581 CAGCAATCCC AAGCCTTTTA ATGATTGATT TTTTATAAAG AAAGGTGAAA ACTGTGTATA 

2641 CTCTCGGAAT CGACGTTGGT TCTTCTTCTT CCAAGGCAGT CATCCTGGAA GATGGCAAGA 

2701 AGATCGTGGC CCATGGCGTC GTTGAAATCG GCACCGGTTC GACCGGTCCG GAACGCGTCC 

2761 TGGACGAAGT CTTCAAAGAT ACCAACTTAA AAATTGAAGA CATGGCGAAC ATCATCGCCA 

2821 CAGGCTATGG CCGTTTCAAT GTCGACTGCG CCAAAGGCGA AGTCAGCGAA ATCACGTGCC 

2881 ATGCCAAAGG GGGCCTCTTT GAATGCCCCG GTAGGACGAC CATCCTGGAT ATCGGCGGTC 

2941 AGGACGTCAA GTGCATCAAA TTGAATGGCC AGGGCCTGGT CATGCAGTTT GCCATGAACG 

3001 ACAAATGCGC CGCTGGTACG GGCCGTTTCC TCGACGTCAT GTCGAAGGTA CTGGAAATCC 

3061 CCATGTCTGA AATGGGGGAC TGGTACTTCA AATCGAAGCA TGCCGCTGCC .GTCAGCAGTA 

3121 CCTGCACGGT TTTTGCTGAA TCGGAAGTCA TTTCCCTTCT TTCCAAGAAT X3TCGCGAAAG 

3181 AAGATATCGT AGCCGGTGTC CATCAGTGCA TCGCCGCCAA AGGCTGCGCT CTGGTGCGCC 

3241 GCGTCGGTGT CGGTGAAGAC -CTGACCATGA CCGGCGGTGG CTCOCGGGAT -GGCGGGGTCG 

3301 TOGASGGGGT ATCGAAAGAA TTAGGTATTC "CTGTCAGAGT TTGCTCTGCAT CCCCAAGCGG 

3361 TGGGTGCTCT CGGAGCTGCT TTGATTGCTT ATGATAAAAT CAAGAAATAA GTCAAAGGAG 
3421 AGAACAAAAT CATGAGTGAA GAAAAAACAG TAGATATTGA AAGCATGftGC TCCAAGGAAG

3481 CCCTTGGTTA CTTCTTGCCG AAAGTGGATG AAGAGGCACG TAAAGGGAAA AAAGAAGGCC 

3541 GCCTCGTTTG CTGGTCCGCT TCTGTCGCTC CTCCGGAATT CTGCACGGCT ATfcGACATCG 

3601 CCATCGTCTA TCCGGAAACT CACGCAGCTG GTATCGGTGC CCGTCACGGT ' GCTCCGGCCA 

3661 TGCTGGAAGT TGCTGAAAAC AAAGGTTACA ACCAGGACAT CTGTTCCTAC TGCCGCGTCA 

3721 ACATGGGCTA CATGGAACTC CTCAAACAGC AGGCTCTGAC AGGCGAAACG CCGGAAGTCC 

3781 TCAAAAACTC CCCGGCTTCT CCGATTCCCC TTCCGGATGT TGTCCTCACT TGCAACAACA 

3841 TCTGCAATAC CTTGCTCAAA TGGTATGAAA ACTTGGCTAA AGAATTGAAC GTAGCTCTCA 

3901 TCAACATGGA CGTACCGTTC AACCATGAAT TCCCTGTTAC GAAACACGCT AAACAGTACA 

3961 TCGTCGGCGA ATTCAAACAT GCTATCAAAC AGCTCGAAGA CCTTTGCGGC CGTCGCTTCG 

4021 ACTATGACAA ATTCTTCGAA <3TACAGAAAC AGACACAGCG CTCCATCGCT GCCTGGAACA 

4081 AAATGGCTAC GTACTTCCAG TACAAACCGT CGCCGCTCAA CGGCTTCGAC CTCTTCAACT 

4141 ACATGGGCCT CGCCGTTGCT GCCCGCTCCT TGAACTACTC * GGAAATCACG TTCAACAAAT 

4201 TCCTCAAAGA ATTGGACGAA AAAGTAGCTA ATAAGAAATG GGCTTTCGGT GAAAACGAAA 

4261 AATCGCGTGT TACTTGGGAA GGTATCGCTG TCTGGATCGC TCTCGGCCAC ACCTTCAAAG 

4321 AACTCAAAGG TCAGGGCGCT CTCATGACTG GTTCCGCTTA TCCTGGCATG TGGGACGTTT 

4381 CCTACGAAGC GGGCGACCTC GAATCCATGG CAGAAGCTTA TTCCCGTACA TACATCAACT 

4441 GCTGGCTCGA ACAGCGCGGT GCTGTTCTTG AAAAAGTTGT CCGCGATGGC AAATGCGACG 

4501 GCTTGATCAT GCACCAGAAC CGTTCCTGCA AGAACATGAG CCTCCTCAAC AACGAAGGCG 

4561 GCCAGCGCAT CCAGAAGAAC CTCGGCGTAC CGTACGTCAT CTTCGACGGC GACCAGACCG 

4621 ATGCTCGTAA CTTCTCGGAA GCACAGTTCG ATACCCGCGT AGAAGCTTTG GCAGAAATGA 

4681 TGGCAGACAA AAAAGCCAAT GAAGGAGGAA ACCACTAATG AGTCAGATCG ACGAACTTAT 

4741 GAGCAAATTA- CAGGAAGTAT CCAACCATCC CCAGAAGACG GTTTTGAATT ATAAAAAACA 

4801 GGGTAAAGGC CTCGTAGGCA TGATGCCCTA CTACGCTCCG GAAGAAATCG TATATGCTGC 

4861 AGGCTACCTC GCGGTAGGCA TGTTCGGTTC CCAGAACCCG CAGATCTCCG CAGCTCGTAC 

4921 GTACCTTCCT CCGTTCGCTT GCTCCTTGAT GCAGGCTGAC ATGGAACTCC AGCTCAACGG 

4981 CACCTATGAC. TGCCTCGACG CTGT2ATCIX CTCCCCTGCT--TGCGACACTC..-TCCGCTGCAT 

5041 GAGCCAGAAA TGGCACGGCA AAGCTCCGGT CATCGTCTTC ACACAGCCGC AGAACCGTAA 

5101 X3ATCCGCCCG <3CTGTCGATT TCCTCAAAGC TGAATACGAA CATGTCCGTA CGGAATTGGG 

5161 ACGTATCCTC AACGTAAAAA TCTCCGACCT GGCTATCCAG GAAGCTATCA AAGTATATAA 

5221 CGAAAACCGT CAGGTTATGC GTGAATTCTG CGACGTAGCT GCTCAGTACC CGCAGATCTI 

5281 CACTCCGATA AAACGTCATG ACGTCATCAA AGCGCGCTGG TTCATGGACA AAGCTGAACA 

5341 CACCGCTTTG GTGCGCGAAC TCATCGACGC TGTCAAGAAA GAACCGGTAC AGCCGTGGAA 

5401 TGGCAAAAAA GTCATCCTCT CCGGTATCAT GGCAGAACCG GATGAATTCC TCGATATCTT 

5461 CAGCGAATTC AACATCGCTG TCGTCGCTGA CGACCTCGCT CAGGAATCCC GCCAGTTCCG 

5521 TACAGACGTA CCGTCCGGCA TCGATCCCCT CGAACAGCTC GCTCAGCAGT GGCAGGACTT 

5581 CGATGGCTGC CCGCTCGCTT - TGAAGGAAGA CAAACCGCGT GGCCAGATGC TCATCGACAT 

5641 GACTAAGAAA TACAATGCTG ACGCCGTCGT CATCTGCATG ATGCGTTTCT GCGATCCTGA 

5701 AGAATTOGAC TATCCGATTT ACAAACCGGA ATTTGAAGCT GCTGGCGTTC GTTACAOGGT 

5761 CCTCGACCTC <3ACATCGAAT CTCCGTCCCT CGAACAGCTC CGCACCCGTA TCCAGGCTTT 

5821 CTCGGAAATC CTCTAAGAAT CGCCTGAATC ATCAAACATC TGGGCGGGAC TCCGAAAGGT 

5881 GCCTGCTACA TGATACATTG CCTGTTTTCA GGCAGACAGA TTTGCAGCTT GCGGCCCCCA 

5941 TTGTACGGGC TGCAAGCTGT CAATGATGCT - TTAAAGACGG CTCTGCCGTT TTTAAATAAA

6001 AACATAAAAC CATATATAAT CTATTAGGAG GAAACTCAAT CATGGAATTC AAACTTTCTG 

6061 AATTACAGCA AGATATCGCA AATCTCGCAA AAGATTTCGC AGAAAAAAAA TTAGCTCCCA 

6121 CTGTCAAAGA GCGTGACGAA AAAGAAGTTT TCGATCGTGC TATCCTTGAC GAAGTGGGTA 

6181 CTCTCGGCCT TCTCGGTATT CCCTGGGAAG AAGAAAAGGG CGGCGTAGGC GCTGACTTCC 

6241 TCAGCCTCGC AGTTGCTTGC GAAGAAGTAG CTAAAGTTAC CW3CCCGGGC CGTCG (SEQ ID NO: 33)
Figure 23

ATGAAACCAATGA<3ACTACAT€AGGTAGGCATTGT<:CTGG<:<3ACCTTAGAAAAAGGCCAT 

GAATTCATGCAGAATAATGGACTTGAAATCGACTATGCGGGCTATGTCGATGCTTACCAG 

GCTGATGTCATTTTCACTAAGTTTGGTGAATTTGCCAGCCCGATTGAAATGATTATCCCG 

CACTCCGGTGTGCTTACCCAATTCAATGGTGGGCGCGGCGGCATTGCCCACATCGGCTTC 

GAAGTGGACGATGTCGAAGCTGTCCGCCAGGAAATGGAAGCAGAT^ 

TTAGAAAAGAAAGCTGTCCAGGGTACGGACGACATTATCGTCAACTTCGGCCGCCCGACA 

ACCAACCAGGGTATCCTCGTTGAATATGTTCAGACGACAGCACCTATCAGeGGCGGCGGC 

GAAAATCCTTTCGTTAAGAATCTCGGCCCGGAAAAAGGGAAGCTCAACGAAACATGGCAT 

CCCATGCGCCTGCACCATATCGGCATCGTCTTGCCGACCTTGGAAAAGGCCCATGAATTC 

ATCAAGACCAATGGTCTGGAAGTGGATTATTGCGGTTTGGTCGACGCCTACCATGCGGAT 

CTCATTTTClCCTAAAAAAGGTGAAAACAGTACGCCTATCGAATTCATTATTCCCCGT<3AA 

GGGGTGCTCAAAGATTTCAATCATGGCAGGGGAGGTATCGCTCATATCGCCTTTGAAGTG 

GATGATGTCGAAAAGGTACGTCAGATTATGGAAAGCCAGAAGCCTGGTTGCATGCTGGAA 

AAGAAAGCCGTCCGGGGAACGGACGATATCATCGTCAACTTCCGCCGTCGCAGCACGGAC 

GCCGGCATCCTCGTCGAATATGTCCAGACCGTAGCTCCCATCAATCGCAGCAATCCCAAC 

CCTTTTAATGATTGA (SEQ ID NO: 34)



Figure 24

MKPMRLHHVGIVLPTLEKAHEFMQNNGLEIDYAGYVDAYQADLIFTKFGEFASPIEMIIP 
HSGVLTQFNGGRGGIAHIAFEVDDVEAVRQEMEADCPGCMLEKKAVQGTDDIIVNFRRPT
TNQGILVEYVQTTAPITGRGENPFVKNLGPEKGKLNETWHPMRLHHIGIVLPTLEKAHEF 
IKTNGLEVDYSGFVDAYHADLIFTKKGENSTPIEFIIPREGVLKDFNHGRGGIAHIAFEV
DDVEKVRQIMESQKPGCMLEKKAVRGTDDIIVNFRRPSTDAGILVEYVQTVAPINRSNPN 
PFND (SEQ ID NO: 35) 
Figure 25

ATGGAATTCAAACTTTCTGAATTACAGCAAGATATC^ 

GAAAAAAAATTAGCTGCCACTGTCAAAGAGCGTGACGAAAAAGAAGTTTTCGATCGTGCT 
ATGCTTGACGAAGTGGGTACTCTCGGCCTTCTGGGTATTCCCTGGGAAGAAGAAAACGGC 
GGCGTAGGCGCTGACTTCCTCAGCCTCGCAGTTGCTTGCGAA^AAGTAGCTAAAGTTACG 
AGCCCGGGGCGTCG (SEQ ID NO: 36)



Figure 26

MEFKLSELQQDIANLAKDFAEKKLAPTVKERDEKEVFDRAILDEVGTLGLLGIPWEEENG 
GVGADFLSLAVACEEVAKVTSPGR (SEQ ID NO: 37) 
Figure 27

1 GTGAGCACAC ACTTGATAGC TGATGCCGTC AATGATCAGT TGTTCGTCTA TAGCAGGCTG 

61 AAAGGACATG GGTTTGGTCA CAGTCTGAGC AGTTGCAGGC AGTCAAACAC GTTCGTAACT 

121 ACGCTGTAGA TGATATAAGC AGTATAGCAT CTTGCTACGC TCTCGTTGAT CAGGTTGAAT 

181 GCTTTGAGGA AGGTCAGGCG AATAGCCATG CCTCTTGTTT CCAGAACATG GCATGGGGAT 

241 GGATCGAGGG TACCCTGTCG GATGCATGCT ATCCGTGGCA TTCATATCAT CAACCAGAAT 

301 TTGATCTTGA ACTACACAGC AATTCTGCGC GTTATGCAAG TGTCTTGGGT CAGATGGTGA 

361 ACAATTCTCA ATTGTTGAGG TCTTGACGAA TTGCGTTATA CACTGTAGGC TATAGTATGC 

421 ACCCCTTCTT ATCTATATCA CAACCGGTCT ATTAGCATTT GCGTCAAGGA GGATGGTCGA 

481 TGATGGACAC TGCGGCCCTT GCOCCACCAC GGGGGGCCCG CTCTAATCCG ATTCGGGATC 

541 GAGTTGATTG GGAAGCTCAG CGCGCTGCTG CGCTGGCAGA TCCCGGTGCC TTTCATGGCG 

601 CGATTGCCCG GACAGTTATC CACTGGTACG ACCCACAACA CCATTGCTGG ATTCGCTTCA

661 ACGAGTCTAG TCAGCGTTGG GAAGGGCTGG ATGCCGCTAC CGGTGCCCCT GTAACGGTAG

721 ACTATCCCGC CGATTATCAG CCCTGGCAAC AGGCGTTTGA TGATAGTGAA GCGCCGTTTT 

781 ACCGCTGGTT TAGTGGTGGG TTGACAAATG CCTGCTTTAA TGAAGTAGAC <CGGCATGTCA 

841 TGATGGGCTA TGGCGACGAG GTGGCCTACT ACTTTGAAGG TGACCGCTGG GATAACTCGC 

901 TCAACAATGG TCGTGGTGGT CCGGTTGTOC AGGAGACAAT CACGCGGCGG CGCCTGTTGG 

961 TGGAGGTGGT GAAGGCTGCG CAGGTGTTGC GTGATCTGGG CCTGAAGAAG GGTGATCGGA 

1021 TTGCTCTGAA TATGCCGAAT ATTATGCCGC AGATTTATTA TACGGAAGCG GCAAAACGAC 

1081 TGGGTATTCT GTACACGGCG GTCTTCGGTG GCTTCTCGGA CAAGACTCTT TCCGACCGTA 

1141 TTCACAATGC CGGTGCACGA GTGGTGATTA CCTCTGATGG TGCGTACCGC AACGCGCAGG 

1201 TGGTGCCCTA CAAAGAAGCG TATACCGATC AGGCGCTCGA TAAGTATATT CCGGTTGAGA 

1261 CGGCGCAGGC GATTGTTGCG CAGAGCCTGG CCACCTTGCC CCTGACTGAG TCGCAGGGCC 

1321 AGACGATCAT CACCGAAGTG GAGGCCGCAC TGGCCGGTGA GATTACGGTT GAGCGCTCGG 

1381 ACGTGATGCG TGGGGTTGGT TCTGCCCTCG CAAAGCTCCG CGATCTTGAT GCAAGCGTGC 

1441 AGGCAAAGGT GCGTACAGTA CTGGCGCAGG CGCTGGTGGA GTCGCCGCCG CGGGTTGAAG 

1501 CTGTGGTGGT TGTGCGTCAT ACCGGTCAGG AGATTTTGTG GAACGAGGGG <X3AGATCGCT 

1561 GGAGTCACGA CTTGCTGGAT GCTGCGCTGG CGAAGATTCT GGCCAATGCG OGTGCTGCCG 

1621 GCTTTGATGT GCACAGTGAG AATGATCTGC TCAATCTCCC CGATGACCAG -CTTATCCGTG 

1681 <xk:tctacgc cagtattccc tgtgaaccgg TTGATGCTGA atatccgatg tttatcattt
1741 ACACATCGGG tagcaccggt aagcccaagg gtgtgatcca cgttcacggc ggttatgtcg 

1801 GCGGTGTGGT GCACACCTTG CGGGTCAGTT TTGACGCCGA GCCGGGTGAT ACGATATATG 

1861 TGATCGCCGA TCCGGGCTGG ATCACCGGTC AGAGCTATAT GCTCACAGCC ACAATGGCCG 

1921 GTCGGCTGAC CGGGGTGATT GCCGAGGGAT CACCGCTCTT CCCCTCAGCC GGGCGTTATG 

1981 CCAGCATCAT CGAGCGCTAT GGGGTGCAGA TCTTTAAGGC GGGTGTGACC TTCCTCAAGA 

2041 CAGTGATGTC CAATCCGCAG AATGTTGAAG ATGTGCGACT CTATGATATG CACTCGCTGC 

2101 GGGTTGCAAC -CTTCTGCGCC GAGCCGGTCA GTCCGGOGGT GCAGCAGTTT GGTATGCAGA 

2161 TCATGACCCC GCAGTATATC AATTCGTACT GGGCGACCGA GCACGGTGGA ATTGTCTGGA 

2221 CGCATTTCTA CGGTAATCAG GACTTCCCGC TTCGTCCCGA TGCCCATACC TATCCCTTGC 

2281 CCTGGGTGAT GGGTGATGTC TGGGTGGCCG AAACTGATGA GAGCGGGACG ACGCGCTATC 

2341 GGGTCGCTGA TTTCGATGAG AAGGGCGAGA TTGTGATTAC CGCCCCGTAT CCCTACCTGA 

2401 CCCGCACACT CTGGGGTGAT GTGCCCGGTT TCGAGGCGTA CCTGCGCGGT GAGATTCCGC 

2461 TGCGGGCCTG GAAGGGTGAT GCCGAGCGTT TCGTCAAGAC CTACTGGCGA CGTGGGCCAA 

2521 ACGGTGAATG GGGCTATATC CAGGGTGATT TTGCCATCAA GTACCCCGAT GGTAGCTTCA

2581 CGCTCCACGG ACGCCCTGAC GATGTGATCA ATGTGTCGGG CCACCGTATG GGCACCGAGG 

2641 AGATTGAGGG TGCCATTTTG CGTGACCGCC AGATCACGCC CGACTGGCCC GTCGGTAATT 

2701 GTATTGTGGT CGGTGCGCCG CACCGTGAGA AGGGTCTGAC OGCGGTTGCC TTCATTCAAC 

2761 CTGCGCCTGG CCGTCATCTG ACCGGCGGGG ACCGGCGCCG TCTCGATGAG CTGGTGOGTA 

2821 CCGAGAAGGG GGCGGTCAGT GTCCCAGAGG ATTACATCGA GGTCAGTGCC TTTCCCGAAA 

2881 CCCGCAGCGG GAAGTATATG CGGCGCTTTT TGCGCAATAT GATGCTCGAT GAACCACTGG 

2941 GTGATACGAC GACGTTGCGC AATCCTGAAG TGCTCGAAGA GATTCCAGCC AAGATGGCTG 

3001 AGTGGAAACG CCGTCAGCGT ATGGCCGAAG AGCAGCAGAT CATCGAACGC TATCGCTACT 

3061 TCCGGATCGA GTATCACCCA CCAACGGCCA GTGCGGGTAA ACTCGCGGTA GTGACGGTGA

3121 CAAATCCGCC GGTGAACGCA CTGAATGAGC GTGCGCTCGA TGAGTTGAAC ACAATTGTTG 

3181 ACCACCTGGC CCGTCGTCAG GATGTTGGGG CAATTGTCTT CACCGGACAG GGCGCCAGGA 

3241 GTTTTGTCGC CGGCGCTGAT ATTCGCCAGT TGCTCGAAGA GATTCATACG GTTGAAGAGG 

3301 CAATGGGGCT GCCGAATAAC GCCCATCTTG CTTTCCGCAA GATTGAGCGT ATCAATAAGC 

3361 CGTGTATCGC GGCGATCAAC GGTGTGGCGC TCGGTGGTGG TCTGGAATTC GGCATGGGCT 
3421 GGCATTACCG -GGTTGOCGAT GTCTATGGCG AATTCGGTCA GCCAGAGATT AATCTGGGCT
3481 TGCTAGCTGG TTATGGTGGC ACGCAGCGCT TCGCGGGCCT GTTGTACAAG GGCAACAACG 
3541 GCACCGGTCT GCTCCGAGGG CTGGAGATGA TTCTGGGTGG GCGTAGCGTA GCGGCTGATG 
3601 AGGCGCTGAA GCTGGGTCTG ATCGATGCCA TTGCTACGGG CGATCAGGAC TCACTGTCGC 
3661 TGGCATGCGC GTTAGCCCGT GCCGCAATCG GGGCCGATGG TCAGTTGATC GAGTCGGCTG 
3721 CGGTGACCCA GGCTTTCCGC CATCGCCACG AGCAGCTTGA CGAGTGGCGC AAACCAGACC 
3781 CGCGCTTTGC CGATGACGAA CTGCGCTGGA TTATCGCCCA TCCACGTATC GAGGGGATTA 
3841 TCCGGCAGGC GCATACCGTT GGGCGCGATG CGGCAGTGCA TCGGGCACTG GATGCAATCC 
3901 GCTATGGCAT TATCCAGGGC TTCGAGGCOG GTCTGGAGCA CGAGGCGAAG CTCTTTGCCG 
3961 AGGCAGTGGT TGACCCGAAC GGTGGCAAGC GTGGTATTCG CGAGTTCCTC GAGCGCCAGA 
4021 GTGCGCCGTT GGCAACCCGC CGAGCATTGA TTACACCTGA ACAGGAGCAA CTCTTGCGCG 
4081 ATCAGAAAGA ACTGTTGCCG GTTGGTTCAC CCTTCTTOCC CGGTGTTGAC CGGATTCCGA 
4141 AGTGGCAGTA GGCGCAGGCG GTTATTCGTG ATCCGGACAC CGGTGCGGCG -GCTCAGGGCG 
4201 ATCCCATCGT GGCTGAAAAG CAGATTATTG TGCCGGTGGA ACGCCCCCGC GCCAATCAGG 
4261 CGCTGATCTA TGTTCTGGCC TCGGAGGTGA ACTTCAACGA TATCTGGGCG ATTACCGGTA 
4321 TTCCGGTGTC ACGGTTTGAT GAGCACGACC GCGACTGGCA CGTTACCGGT TCAGGTGGCA 
4381 TCGGCCTGAT CGTTGCGCTG GGTGAAGAGG CGCGACGCGA AGGCCGGCTG AAGGTGGGTG 
4441 ATCTGGTGGC GATCTACTCC GGGCAGTCGG ATCTGCTCTC ACCGCTGATG GGCCTTGATC 
4501 CGATGGCCGC -CGATTTCGTC ATCCAGGGGA ACGACACGGC AGATGGATCG CATCAGCAAT
4561 TTATGCTGGC CCAGGCCCCG CAGTGTCTGC CCATCCCAAC CGATATGTCT ATCGAGGCAG 
4621 CCGGGAGCTA CATCCTCAAT CTCGGTACGA TCTATCGCGC CCTCTTTAGG ACGTTGCAAA 
4681 TCAAGGCCGG ACGCACCATC TTTATCGAGG GTGCGGCGAC CGGTACCGGT CTGGACGCAG 
4741 CGCGCTCGGC GGCCCGGAAT GGTCTGCGCG TAATTGGAAT GGTCAGTTCG TCGTCACGTG 
4801 CGTCTACGCT GCTGGCTGCG GGTGCCCACG GTGCGATTAA CCGTAAAGAC CCGGAGGTTG 
4861 CCGATTGTTT CACGCGCGTG CCCGAAGATC CATCAGCCTG GGCAGCCTGG GAAGCCGCCG 
4921 GTCAGCCGTT GCTGGGGATG TTCCGGGOGC AGAACGACGG GCGACTGGCC GATTATGTGG 
4981 TCTCGCACGC GGGCGAGAGG GCCTTOCCGC GCAGTTTCCA GCTTCTCGGC GAGCCACGCG 
5041 ATGGTCACAT TCCGACGCTC ACATTCTACG GTGCCACCAG TGGCTACCAC TTCACCTTCC 
5101 TGGGTAAGCC AGGGTCAGCT TCGCCGACCG AGATGCTGOG GCGGGCCAAT CTCCGCGCCG 
5161 GTGAGGCGGT GTTGATCTAC TACGGGGTTG GGAGCGATGA CCTGGTAGAT ACCGGCGGTC 
5221 TGGAGGCTAT CGAGGCGGCG CGGCAAATGG GAGCGCGGAT CGTCGTCGTT ACCGTCAGCG 
5281 ATGCGCAAGG CGAGTTTGTC CTCTCGTTGG GCTTCGGGGC TGCCCTACGT GGTGTCGTCA 
5341 GCCTGGCGGA ACTCAAACGG CGCTTCGGCG ATGAGTTTGA GTGGCCGCGC ACGATGCGGC 
5401 CGTTGCCGAA CGCCCGCCAG GACCCGCAGG GTCTGAAAGA GGCTGTCCGC CGCTTCAACG 
5461 ATCTGGTCTT CAAGCCGCTA GGAAGCGCGG TCGGTGTCTT CTTGCGGAGT GCCGACAATC 
5521 CGCGTGGCTA CCCCGATCTG ATCATCGAGC GGGCTGCCCA CGATGCACTG GCGGTGAGCG 
5581 CGATGCTGAT CAAGCCCTTC ACCGGACGGA TTGTCTACTT CGAGGACATT GGTGGGCGGC 
5641 GTTACTCCTT CTTCGCACCG CAAATCTGGG TGCGCCAGCG CCGCATCTAC ATGCCGACGG 
5701 CACAGATCTT TGGTACGCAC CTCTCAAATG CGTATGAAAT TCTGCGTCTG AATGATGAGA 
5761 TCAGCGCCGG TCTGCTGACG ATTACCGAGC CGGCAGTGGT GCCGTGGGAT GAACTACCCG 
5821 AAGCACATCA GGCGATGTGG GAAAATCGCC ACACGGCGGC CACTTATGTG GTGAATCATG 
5881 CCTTACCACG TCTCGGCCTA AAGAAGAGGG ACGAGCTGTA CGAGGCGTGG ACGGCCGGCG 
5941 AGCGGTAGCG CGGATGGGTA TTGAACAGGT AAGGGACGGA AGATCGAACC TTCCGTCCGT 
6001 TATCTTTTGG CCGTCGAAGC GTGCTGAGCC GATTATGGTT GCCGTGGTTG TCCCGATGGG 
6061 C AG ACGCGCT CGAACCAGAT GATACCftCCG ACGGCTATCG TCAOCAAACC GGCGAAGACC 
6121 AGGTAAGCCT CTGAAGGACG C (SEQ ID NO: 38) 
Figure 28

1 MIDTAPLAPP RAPRSNPIRD RVDWEAQRAA ALADPGAFHG AIARTVIHWY DPQHHCWIRF 

61 NESSQRWEGL DAATGAPVTV DYPADYQPWQ GAFDDSEAPF YRWFSGGLTN ACITflEVDRHV 

121 MMGYGDEVAY YFEGDRWDNS LNNGRGGPW QETITRRRLL VEWKAAQVL RDLGLKKGDR 

181 IALNMPNIMP OIYYTEAAKR LGILYTPVFG GFSDKTLSDR IHNAGARWI TSDGAYRNAQ 

241 WPYKEAYTD QALDKYIPVE TAQAIVAQTL ATLPLTESQR QTIITEVEAA LAGEITVERS 

301 DVMRGVGSAL AKLRDLDASV QAKVRTVIAQ ALVESPPRVE AVWVRHTGQ EILWNEGRDR 

361 WSHDLLDAAL AKILANARAA -GFDVHSENDL LNLPDDQLIR ALYASIPCEP VDAEYPMFII 

421 YTSGSTGKPK GVIHVHGGYV AGWHTLRVS FDAEPGDTIY VIADPGfllTG QSYMLTATMA 

481 GRLTGVIAEG SPLFPSAGRY ASIIERYGVQ IFKAGVTFLK TVMSNPQNVE DVRLYDMHSL 

541 RVATFCAEPV SPAVQQFGMQ IMTPQYINSY WATEHGGIVW THFYGNQDFP LRPDAHTYPL 

601 PWVMGDVWVA ETDESGTTRY RVADFDEKGE IVITAPYPYL TRTLWGDVPG FEAYLRGEIP 

661 LRAWKGDAER FVKTYWRRGP NGEWGYIQGD FAIKYPDGSF TLHGRPDDVI NVSGHRMGTE 

721 EIEGAILRDR QITPDSPVGN CIWGAPHRE HGLTPVAFIQ PAPGRHLTGA DRRRLDELVR 

781 TEKGAVSVPE DYIEVSAFPE TRSGKYMRRF LRNMMLDEPL GDTTTLRNPE VLEEIAAKIA 

841 EWKRRQRMAE EQQIIERYRY FRIEYHPPTA 5AGKLAWTV TNPPVNALNE RALDELNTIV 

901 DHIARRQDVA AIVFTGQGAR SFVAGADIRO LLEEIHTVEE AMALPNNAHL AFRKIERMNK 

961 PCIAAINGVA LGGGLEFAMA CHYRVADVYA EFGQPEINLR LLPGYGGTQR LPRLLYKRNN 

1021 GTGLLRALEM ILGGRSVPAD EALKLGLIDA IATGDQDSLS LACALARAAI GADGQLIESA 

1081 AVTQAFRHRH EQLDEWRKPD PRFADDELRS IIAHPRIERI IRQAHTVGRD AAVHRALDAI 

1141 RYGIIHGFEA GLEHEAKLFA EAWDPNGGK RGIREFLDRQ SAPLPTRRPL ITPEQEQLLR 

1201 DQKELLPVGS PFFPGVDRIP KWQYAQAVIR DPOTCAAARG DPIVAEKQII VPVERPRANQ 

1261 ALIYVLASEV NFNDIWAITG IPVSRFDEHD RDWHVTGSGG IGLIVALGEE ARREGRLKVG 

1321 DLVAIYSGQS DLLSPLMGLD PMAADFVIQG N0TPDGSHQQ FMLAQAPQCL PIPTDMSIEA 

1381 AGSYILNLGT IYRALFTTLQ IKAGRTIFIE GAATGTGLDA ARSAARNGLR VIGMVSSSSR 

1441 ASTLLAAGAH GAINRKDPEV ADCFTRVPED PSAWAAWEAA GQPLLAMFRA QNDGRLADYV 

1501 VSHAGETAFP RSFQLLGEPR DGHIPTLTFY GATSGYHFTF LGKPGSASPT EMLRRANLRA

1561 GEAVLIYYGV GSDDLVDTGG LEAIEAARQM GARIWVTVS DAQREFVLSL "GFGAALRGW 

1621 SLAELKRRFG DEFEWPRTMP PLPNARQDPQ -GLKEAVRRFN DLVFKPLGSA VGVFLRSADN 

1681 PRGYPDLIIE RAAHDALAVS AMLIKPFTGR IVYFEDIOGR RYSFFAPQIW VRQRRIYMPT 

1741 AQIFGTHLSN AYEILRLNDE ISAGLLTITE PAWPWDELP EAHQAMWENfc HTAATYWNH 

1801 ALPRLGLKNR DELYEAWTAG ER (SEQ ID NO: 39) 
Figure 29

ATGAGTGAAGAGTCTCTGGTTCTCAGCACAATTGAAGGC<:GGATC 
AATCGCCCCCAGGCCCTCAATGGGCTCAGTCCGGCCTTGATTGATGACCTCATTCGCCAT 
TTAGAAGCCTGCGATGCCGATGACACAATCCGG<5TGATCATTATCACCGGC<3CG<3GACGG 
GCATTTGCTGCCGGCGCTGATATCAAAGCGATGGCCAATGCCACGCCTATTGATATGCTC 
ACCAGTGGCATGATTGCGCGCTGGGCACGCATCGCCGCGGTGCGCAAACCGGTGATTGCT 
GCC-GTGAATGGGTATGCGCTCGGTGGTGGTTGTGAATTGGCAATGATGTGCGACATCATC 
ATCGCCAGTGAAAACGGGCAGTTCGGACAACCGGAAATCAATCTGGGCATCATTCCC<M 
GCTGGTGGCACCCAACGGCTGACCCGCGCCCTTGGCCCGTATCGCGCAATGGAATTGATC 
CTGACGGGCG<^ACCATCAGTGCTCAGGAAGCTCTCGCCCAC<^CTGGTGTGCCGGGTC 
TGCeCGCCTGATUVGCCTGCTCGATGAAGCCCGTGGGATCGCGCAAACCATTGCCACCAAA 
TCACCACTGGCTGTACAGTTGGCGAAAGAGGCAGTCCGTATGGCCGCCGAAAGCACTGTG 
CGCGAGGGGTTGGCTATCGAGCTGCGTAACTTGTATCTGCTGTTTGCGAGTGCTGACCAA 
AAAGAGGGGATGCAGGCATTTATGGAGAAACGCGCTCCCAACTTCAGTGGTGGTTGA 
(SEQ ID NO: 40)
Figure 30



MSEESLVLSTIEGPIAILTLNRPQALNALSPALIDDLIRHLEACDADDTIRVIIITGAGR 
AFAAGADIKAMANATPIDMLTSGMIARWARIAAVRK^ 

IASENAQFGQPEINLGIIPGAGGTQRLTRALGPYRAMELILTGATISAQEALAHGLVCRV 
CPPESLLDEARRIAQTIATKSPLAVQIJ^AVRMAAETTVMiGLAIELRNFYLLFASADQ 
KEGMQAFXEKRAPNFSGR (SEQ ID NO:41) 
Figure 31

GGCGTAATC<^AC€GGCAGGTTAGGGTCTTCTACTGGGGTCAAGGCGC€TCTCCTTTTGG 
TGGCGGGAGCAACCGGGCTTTTCCTGGCTTCAATGTACCATAGAGCGGTTACTTGGTGCA 
ACGGGCGTGGTACAATCGAGAGCAACCTTT<X^AAAAt^TATCCAATCCTGCACACGTGC 
ATCTGTTACAGGGTATTATTGTCGGCAAAGGACAGTCCTGTGGTTTATGTACAA^GAGAT 

caacgtatgagtgaagagtctctggttctcagcacaattgaaggccccatgg6catcctc 

acgctcaatcgcc<x^caggccctcaatgcgctcagtccggccttgattgatgacctcatt 

<^cc^tttagaagcctgcgatgccgatgac^caatcck^gtgatc 

gga<xsggcatttgctgccggggctgatatcaaagcgatggccaatgccacgcctattgat 

atgct<^ccagtggcatgattgcgcgctgggcacgc^vtcgccgcx;gtgcgcaaaccggtg 

ATTGCTGCCGTGAATGGGTATGCGCTCGGTGGTGGTTGTGAATTGGCAATGATGTGCGAC 
ATCATCATCGCCAGTGAAAACGCGCAGTTCGGACAACCGGAAATC7VATCTGGGCATCATT 
CGCGGTGCTGGTGGCA€CCAACGGCTGACGCGCKX;CCTTGGCGCGTATC<^G€7VATGGAA 
TTGATCCTGACCGGCGCGACCATCAGTGCTCAGGAAGCTCTCGCCCACGGCCTGGTGTGC 
C<K3GTCTGCCG<3CCTGAAAGCCTGCTCGATGAAGCCC<3TGGGATCGCGCAAACCATTGCC 
ACCAAATCACCACTGGCTGTACAGTTGGCGAAAGAGGGAGTCCGTATGGCCGCCGAAACC 
ACTGTGCGCGAGGGGTTGGCTATCGAGCTGCGTAAGTTCTATGTGCTGTTTGCCAGTGCT 
GACCAAAAAGAGGGGATGCAGGCATTTATCGAGAAAC<3GGCTCCCAACTTCAGTGGTGGT 
TGATCACGCGCAGAACATGGCAGCAGGGGCAATACCTGCACGTACTGCCTCCTGCCGCCA 
TACTACGAGATGATCGAGCAGTAAAGGGTAAATAGTCTATCAATCTGGCCAGATAA<5CGG 
TTGGGTAACAAGGCAATGCTCCAAAGGAGACGATGATGGACATACACGAGCGATTGCGAT 
CTGTGGAACGCGAAAATGCT (SEQ ID NO: 42)
Figure 32



SEQ ID NO: 40 1 — atgagtga agagt— 

SEQ ID NO: 43 1 atgacgta-- cgaaa- 



SEQ ID NO: 44 1 atggccgccctgcgtgt cctgctgtcctgcgcccgcggcc 

SEQ ID NO: 45 1 atggcggccctgcgtgctctgctgcccagagc 

SEQ ID NO: 40 14 ct -ctg gttctc-agcacaattgaa 

SEQ ID NO: 43 14 cc ate ctggtcgagcgc gat 

SEQ ID NO: 44 41 cgctgaggccc ccg gttcgc-tgtcccgcctgg 

SEQ ID NO: 45 33 ctgcaactcgctgttgtccccagttcgc-tgcccagaattc



SEQ ID NO: 40 37 ggccccatcgcc - — atcctcacc 

SEQ ID NO: 43 34 cagcgagttggc ■ — attatcacg 

SEQ ID NO: 44 73 cgtcccttcgcctcgggtgctaactttgagtacatcatcgcagaaaaaag 

SEQ ID NO: 45 73 cggcgcttcgcctcgggtgctaactttcagtacatcatcacg



SEQ ID NO: 40 58 c * 

SEQ ID NO: 43 55 c 

SEQ ID NO: 44 123 agggaagaataacaccgtggggttgatccaac i — 

SEQ ID NO: 45 115 gaaaagaaaggaaagaata



SEQ ID NO: 40 59 tcaatcgcccccaggccctcaatgcgctc

SEQ ID NO: 43 56 —tgaaccgtccccaggcactgaacgegctc 

SEQ ID NO: 44 155 tgaaccgccccaaggccctcaatgcactt 

SEQ ID NO: 45 134 gcagcgtggggctgatccagttgaaccgtcccaaagcactcaatgcactt 

SEQ ID NO: 40 88 agtccggccttgattgatgacctcattc — gccatttagaagcctgcgat 

SEQ ID NO: 43 85 a — acagecagg — tgatgaacgaggtc — acca — gcgctgcaaccgaa 

SEQ ID NO: 44 184 tgcgatggectgattgacgagctcaaccaggccctgaaga — tcttcgag 

SEQ ID NO: 45 184 tgcaatggactgattgaggagctcaacc — aagcactggagacctttgag 

SEQ ID NO: 40 136 gccgatgacaca atccgcgtgatcattatcaccggcgccggacg 

SEQ ID NO: 43 127 ctggacgatgacccggacattggggcgatcatcatcaccggttcggccaa 

SEQ ID NO: 44 232 gaggacccggcc gttgggggcattgtcctcaccggcggggataa 

SEQ ID NO: 45 232 gaagatcccgct gtgggcgccattgtgctcactggtggggagaa 

SEQ ID NO: 40 180 ggcatttgctgccggcgctgatatcaaagcgatggccaa tgee 

SEQ ID NO: 43 177 agcgtttgccgccggagccgacatcaaagaaatggccga cctg 

SEQ ID NO: 44 276 ggcctttgcagctggagctgatatcaaggaaatgcagaacctgagtttcc 

SEQ ID NO: 45 276 ggcctttgcagccggagctgacatcaaggaaatgcagaa r-cegg 

SEQ ID NO: 40 223 acgcctattgatatgctcaccagtggcatgattgcgcgc tgggcacg 

SEQ ID NO: 43 220 acgttcgccgacgcgttcaccgccgacttcttcgccacc tggggcaa 

SEQ ID NO: 44 326 aggactgtt— — - — actccagcaagttcttgaagcac — -tggggeca 

SEQ ID NO: 45 319 acatttcagga-ctgttactca—ggcaagttcctgagccactgggacca 

SEQ ID NO: 40 270 catc^cgcggtgcgcaaaccggtgattgctgccgtgaatgggtatgcgc 

SEQ ID NO: 43 267 gctggccgccgtgcgcaccccgacgatcgccgcggtggcgggatacgcgc 

SEQ ID NO: 44 366 cctcacccaggtcaagaagccagtcatcgctgctgtcaatggctatccgt 

SEQ ID NO: 45 366 tatcacccggatcaagaaaccggtcatcgcggctgtcaatggctatgctc

SEQ ID NO: 40 320 tcggtggtggttgtgaattggcaatgatgtgcgacatcatcatcgccagt 

SEQ ID NO: 43 317 tcggcggtggctgcgagctggcgatgatgtgcgacgtgctgatcgccgcc 
SEQ ID NO: 44 416 ttggcgggggctgtgagcttgccatgatgtgtgatatcatctatgccggt

SEQ ID NO: 45 416 ttggtgggggctgtgaacttgccatgatgtgcgatatcatctatgctggt
370 gaaaacgcgcagttcggacaaccggaaatcaatctgggcatcattcccgg
367 gacaccgcgaagttcggacagcccgagataaagctgggcgtgctgccagg
466 gagaaggcccagtttgcacagccggagatcttaataggaaccatcccagg

466 gagaaagcccagtttggacagccagaaatcctcctggggaccatcccagg 

420 tgctggtggcacccaacggctgacccgcgcccttggcccgtatcgcgcaa
417 ca tgggcggct<:ccagcggctgacccgggctatcggcaaggctaaggcga 
516 tgcaggcggcacccagagactcacccgtgctgttgggaagtcgctggagc 
516 tgcagggggcactcagagactcacccgagcagtcggcaaatcactagcaa

470 tggaattgatcctgaccggcgcgaccatcagtgctcaggaagctctcgcc 

467 tggacctcatcctgaccgggcgcaecatggacgccgccgaggc-cgagcg 
566 tggagatggtcctcaccggtgacgcgatctcagcccaggacgc-caagca 
566 tggagatggtccfccactggtgaccgaatttcagcacaggatgc-caagca 

520 ca-c-ggcctggtgtgccgggtctgcccgcctgaaagcctgctcgatgaa
516 cagc-ggtctggtttcacgggtggtgccggccgacgacttgctgaccgaa 
615 ag-caggtcttgtcagcaagatttgtcctgttgagacactggtggaagaa 
615 ag-caggtcttgtaagcaagatttttcccgttgaaacactggttgaagag 

568 gcccgtcggatcgcgcaaaccattgccaccaaatcaccactggctgtaca 
565 gccagggccactgccacgaecatttcgcagatgtcggcctcggcggcccg 
664 gccatccagtgtgcagaaaaaattgccagcaattctaaaattgtagtagc

664 gccatccaatgtgcagaaaagatcgccaacaattccaagatcatagtagc 

618 gttggcgaaagaggcagtccgtatggccgccgaaaccactgtgcgcgagg 

615 gatggccaaggaggccgtcaaccgggctttcgaatccagtttgtccgagg 

714 gatggccaaagaatcagtgaatgcagcttttgaaatgacattaacagaag 

714 catggcgaaagaatctgtgaatgcagcctttgaaatgacgttaacagaag

668 ggttggctatcgagctgcgtaacttctatctgctgtttgccagtgctgac 

665 ggctgctctacgaacgccggcttttccattcggctttcgcgaccgaagac 
764 gaagtaagttggagaagaaactcttttattcaacctttgccactgatgac

764 gaaataagctggagaagaagctcttctattccacctttgccactgatgac

718 caaaaagaggggatgcaggcatttatcgagaaacgcgctcccaacttcag

715 caatccgaaggtatggcagcgttcatcgagaaacgcgctccccagttcac 
814 cggaaagaagggatgaccgcgtttgtggaaaagagaaaggccaacttcaa 
814 cggagagaagggatgtctgcctttgtggagaaaaggaaggccaacttcaa 

768 tggtcgttga 

765 ccaccgatga 
864 agaccagtga 
864 agaccactga 
Figure 33



SEQ ID NO: 41 1 -mseeslv lstiegp ■ 

SEQ ID NO: 46 1 -mtyetil ver-dqr~ 

SEQ ID NO: 47 1 -maalrvl lscargplrppvrcpawrpfasganfeyiiaekrg 

SEQ ID NO: 48 1 maalrallpracnsllspvrcpefrrfasganfqyiitekkgknss — — 

SEQ ID NO: 41 15 iailtlnrpqalnalspaliddlirhleacdaddtirviiitgagr 

SEQ ID NO: 46 14 vgiitlnrpqalnalnsqvmnevtsaateldddpdigaiiitgsaJc

SEQ ID NO: 47 43 krmtvgliqlnrpkalnalcdglidelnqalkif eedpavgaivltggdk 

SEQ ID NO: 48 47 vgliqlnrpkalnalcnglieelnqaletf eedpavgaivltggek 

SEQ ID NO: 41 61 afaagadikamanatpidmltsgmiarwariaavrkpviaavngyalggg

SEQ ID NO: 46 60 afaagadikemadltfadaftadffatwgklaavrtptiaavagyalggg

SEQ ID NO: 47 93 afaagadikemqnlsfqdcysskflkhwdhltqvkkpviaavngyaf ggg 

SEQ ID NO: 48 93 af aagadikemqnrtfqdcysgkflshwdhitrikkpviaavngyalggg 

SEQ ID NO: 41 111 celanmicdiiiasenaqfgqpeinlgiipgaggtqrltralgpyraiaeli 

SEQ ID NO: 46 110 celammcdvliaadtakfgqpeiklgvlpgmggsqrltraigkakaiodli 

SEQ ID NO: 47 143 celammcdiiyagekaqf aqpeiligtipgaggtqrltravgkslamemv 

SEQ ID NO: 48 143 celairancdiiyagekaqfgqpeillgtipgaggtqrltravgkslamemv 

SEQ ID NO: 41 161 ltgatisaqealahglvcrvcppeslldearriaqtiatksplavqlake 

SEQ ID NO: 46 160 Itgrtmdaaeaersglvsrwpaddlltearatattisqmsasaarmake 

SEQ ID NO: 47 193 ltgdrisaqdakqaglvsicicpvetlveeaiqcaekiasnskiwamake 

SEQ ID NO: 48 193 Itgdrisaqdakqaglvskifpvetlveeaiqcaekiannskilvamake 

SEQ ID NO: 41 211 avrmaaettvreglaielrnfyllfasadqkegmqaf iekrapnf sgr 

SEQ ID NO: 46 210 avnraf esslsegllyerrlfhsaf atedqsegmaaf iekrapqf thr 

SEQ ID NO: 47 243 svnaafemtltegsklekklf ystf atddrkegmtafvekrkanf kdq 

SEQ ID NO: 48 243 svnaaf emtltegnklekklf ystf atddrregmsafvekrkanf kdh 
Figure 39

ATGATCGAGACTGCGC<X:CTTCXX^^ 

C<3AGTTGATTGGGAA<3CTCAG€<3CGCTGCTjGCGCTGGCAGATCCCGGTGCCTTTCATGGC 
GCGATTGCCCGGACAGTTATCCACTGGTACGACCCACAACACCATTGCTGGATTCGCTTC 
AACGAGTCTAGTCAGCX3TTGGGAAGGGCTGGATGCCGCTA€CGGTGCCG<:TGTAACGGTA 

gactatccggccgattatcagccctggcaacaggcgtttgatgatagtgaagcggcgttt 
taccgctggtttagtggtgggttgacaaatgcctgctttaatgaagtagaccggcatgtc 
atgatgggctatggcgacgaggtggcctactactttgaaggtgaccgctgggataactgg 
ctcaacaatggtc<3tggtggtccggttgtccaggagacaatcacgggggggggcctgttg 
gtggaggtggtgaaggctgggcaggtgttgcgtgatctggggctgaagaagggtgatcgg 
attgctctgaatatgcggaatattatgcc<^agatttattatacggaagcggcaaaacga 
<:tgggtattgtgtacaggccggtcttgggtggcttgtcggacaagactctttccgaccgt 
attcacaat(^cggtgcacgagtggtgattacctctgatggtgcgtacggcaaggcgcag 
gtggtggcctagaaagaagcgtataccgatcaggggctggataagtatattccggttgag 
acggcgcaggcgattgttgcgcagagcctggccaccttgcccctgactgagtcgcagcgc 
cagacgatcatcaccgaagtggaggccgcactggcasgtgagattacggttgagcgctcg 
gacgtgatgcgt^ggttggttctgctct^^ 

caggcaaaggikkigtacagtactggcgcaggcgctggtcgagtcggcgccgcgggttgaa 

gctgtggtggttgtgcgtcataccggtcaggagattttgtggaacgaggggcgagatcgc 

tggagtcacgacttgctggatgctgcgctggcgaagattctggccaatgcgcgtgctgcc 

ggctttgatgtgcacagtgagaatgatctgctcaatctccccgatgaccagcttatccgt 

gc^ctctacgccagtattccctgtgaaccggttgaik^tgaatatgcgatgtttatcatt 

tacacatcgggtagcaccggtaagcccaagggtgtgatccacgttcacggcggttatgtc 

gccggtgtggtgcacaccttgcgggtcagttttgacgccgagccgggtgatacgatatat 

gtgatcgccgatccgggctggatcacgggtcagagctatatgctcacagccacaatggcc 

ggtcggctgaccggggtgattgccgagggatcaccgctcttcccctcagcggggggttat 

gccagcatcatggagcgctatggggtgcagatctttaaggcgggtgtgaccttcctcaag 

acagtgatgtccaatcggcagaatgttgaagatgtgggactctatgatatgcactcgctg 

cgggttgcaaccttctgcgccgagccggtcagtggggcggtgcagcagtttggtatgcag 

atgatgaccccgcagtatatcaattcgtactgggcgaccgagcacggtggaattgtctgg 

aggcatttctacggtaatcaggagttcccgcttcgtcccgatgcccatacctatcccttg 

ccgtgggtgatgggtgatgtctgggtggccgaaactgatgagagcgggacgacgcgctat 

cgggtcgctgatttcgatgagaagggcgagattgtgattaoggcccggtatccctacctg 

aggcgcacactctggggtgatgtgcgcggtttcgaggcgtacgtgg<3gggtgagattgcg 

ctgcgggcctggaagggtgatgccgagcgtttggtcaagacctactggcgacgtgggcca 

t^agggtgaatggggctatatccagggtgattttgcgatcaagtaccccgatggtagcttc 

acgctccagggacgccgtgaggatgtgatct^atgtgtcgggccaccgtatgggcaccgag 

gagattgagggtgccattttgggtgaccgggagatcacgcccgactcgggggtcggtaat 

tgtattgtggtcggtgcgccgcaccgtgagaagggtctgagcccggttggcttcattcaa 

cctgcgcctcgccgtcatctc^^ 

accgagaagggggcggtcagtgtcccagaggattacatcgaggtcagtggctttcccgaa 
aggggcagggggaagtatatgcggcgctttttgcgcaatatgatgctggatgaaccactg 
ggtgatacgacgacgttgcgcaatcctgaagtgctggaagagattgcaggcaagatcgct 

<3AGTGGAAACGCCGTCAGCGTATGGCGGAAGAGCAGCAGATCATCGAAGGCTATCGCTAC 
TTCGGGATCGAGTATCACGCACGAACGGCCAGTGGGGGTAAACTCGCGGTAGTGACGGTG 
ACAAATGCGGGGGTGAACGCACTGAATGAGCGTGCGCTCGATGAGTTGAACACAATTGTT 
GACCACCTGGCCCGTCGTCAGGATGTTGCOGCAATT<3TC 

AGTTTTGTCGGGGGCGCTGATATTCGCCAGTTGCTCGAAGAGATTCATACGGTTGAAGAG 
GCAATGGCCCTGCCG7UVTAACGCGCATCTTGCTTTCCGCAAGATTGAGGGTATGAATAAG 
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<:GGTGTATCG03GGGATCAAC^TCTGGC<^ 

TGCCATTACCGGGTTGCGGAT-GTCTATGCCGAATTCGGTCAGCCAGAGATTAATCTGCGC 

TTGCTACCTGGTTATGGTGGCACGCAGC<3CTTGCCGCGCCTGTTGTACAAGGGCAACAAC 

GGCACCGGTCTGCTCCGAGCGCTGGAGATGATTCTGGGTGGGCGTAGCGTACCGQCTGAT 

<3AGGCGCTGAAGCTGGGTCTGATGGATGCCATf<3CTACCGGCGATCAGGACTCACTGTGG 

CTGGCATGCGCGTTAGCCCGTGCCGCAATCGGCGCCGATGGTCACT 

GC€GTGACCCAGGCTTTCCGCCAT€GCCACGAGCAGCTTGACGAGTGGCGCAAACCAGAC 

CC<K;GCTTTGC€GATGACGAACTGCGCTCGATTATCGCCCATCCACGTATGGAGCG<3ATT 

ATCGGGCAGGCCCATACCGTTGGGCGCGATGGGGCAGTGCATCGGGCACTGGATGCAATC 

CGCTATGGCATTATCCACGGCTTCGAGGGCGGTCTGGAGCACGAGGCGAAGCTCTTTCGC 

GAGGCAGTGGTTGAGCGGA7^CGGTGGCAAGCGTGGTATTGGCGAGTTCCTCGACC<3GCAG 

AGTGGGCGGTTGCCAAGCCGCCGACCATTGATTACACCTGAACAGGAGCAACTCTTGCGC 

GATCAGAAAGAACTGTTGCCGGTTGGTTCAGCCTTCTTCCCCGGTGTTGAGCGGATTCCG 

AAGTGGCAGTA<^CGCAGGCGGTTATTCGTGATG^GACAGGGGTGCGGC<3GCTCA<^C 

GATGCGATGGTGGCTGAAAAGCAGATTATTGTGCGGGTGGAAGGCCCCCGCGCCAATCAG 

GGGCTGATCTATGTTCTGGCCTCGGAGGTGAACTTCAACGATATCTGGGCGATTACCGGT 

ATTCCGGTGTCAC<^TTTGATGAGCAGGA^CGCGACTGGCA<!X3TTACCGGTTCAG 

ATGGGCGTGATCGTTGCGCTGGGTGAAGAGGGGCGACGCGAAGGCCGGCTGAAGGTGGGT 

GATGTGGTGGGGATCTACTGGGGGCAGTCGGATCTGCTCTCAGCGCTGATGGGCCTTGAT 

GCGATGGCGGCCGATTtCGTCATGCAGGGGAACGACACGCCAGATGGATCGCATCAGCAA 

TTTATCCTCGCCCAGGCCCCGCAGTGTCTGCGCATGCCAACCGATATGTCTATCGAGGCA 

GCCGGCAGCTAGATCCTCAATCTCGGTACGATCTATCGCGCCCTCTTTACGACGTTGCAA 

ATCAAGGCCGGACGCACCATCTTTATCGAGGGTGCG^GAGGGGTACCGGTCTGGACGCA 

GCGCGCTCGGGGGGCGGGAATGGTCTGCGGGTAATTGGAATGGTCAGTTCGTCGTCACGT 

GCGTCTAGGCTGCTGGCTGC^GGTGC^CACGGTCGGATTAAGGGTAAAGACCGGGAGGTT 

<KX^GATTGTTTCACGGGGGT<3CGCGAAGATCCATCAGCCTGGGCAGCCTGGGAAGGCGCG 

GGTGAGCCGTTGCTGGCGATGTTGC<3GGGGCAGAAGGACGGGCGACTGGCC<3ATTATGTG 

GTCTCGCAGGG<^GGGAGAGGGCCTTGGCGCGCAGTTTCCAGCTTGTCGGCGAGCCACGC 

GATGGTCAGATTGCGACGCTGACATTCTACGGTGCCACCAGTGGCTAGGACTTCACCTTC 

CTGGGTAA^CA<^GTGAGCTTC^3CCGACCGAGAT<^TGCGGC<MGCCAATCTCCGGGGG 

GGTGAGGCGGTGTTGATGTAGTACGGGGTTGGGAGCGATGACCTGGTAGATAGCGGCGGT 

CTGGAGGCTAT<:GA<3GCGGCGC<3GCAAAT<^AGCGCGGATCGTCGTCGTTACGGTCAGG 

gatccgcaaggcgagtttgtgctgtcgttgggcttgggggctgccctaggtggtgtcgtg 
agcgtggcggaactcaaacggggcttcggcgatgagtttgagtggccgcgcacgatggcg 
ccgttggcgaacgccggcgaggacccgcagggtctgaaagaggctgtccggcgcttcaag 

GATGTGGTCTTCAAGCGGGTAGGAAGCGCGGTCGGTGTCTTCTTGC<5GAGTGCCGACAAT 
G<X3CGTGGCTACCCCGATCTCATGATCGAGCGGGCTGGCCAC<5^ 

GC<5ATGCTGATCAAGCCCTTCACCGGACGGATTGTCTACTTCGA<3GACATTGGTGGGCGG 

CGTTACTCCTTCTTCGCACOTCAAATCTGGGT<^GCCAGCGCCGCATCTACATGCCGACG 

GCACAGATCTTTGGTACGCACCTCTCAAATGCGTATGAAATTCTGCGTCTGAATGATGAG 

ATCAGCGCCGGTCTGCTGACGATTAXX3GAGGCGGCAGTCGTGGCGTGGGA 

GAAGCACATCAGGCGATGTGGGAAAATGGCCACACX3GGGGCCACTTATGTGGTGAATCAT 

GCCTTACCAtSGTCTCGGCCTAAAGAACAGGGAC^ 

GAGCGGTAG (SEQ ID NO: 129)
SEQ ID NO: 39
SEQ ID NO: 130 
SEQ ID NO: 131 



SEQ ID NO: 39 
SEQ ID NO: 130 
SEQ ID NO: 131 



SEQ ID NO: 39 
SEQ ID NO: 130 
SEQ ID NO: 131 



SEQ ID NO: 39 
SEQ ID NO: 130 
SEQ ID NO: 131 



Figure 40 

1 midtaplappraprsnpirdrvdwe 

1 mglpeervrsgsgsrgqeeagaggrarswsp — ppevsrsahvpslqryr 

1 mslelkekeselpfdeqiind 

PL PP RS P 

26 aqraaaladpgafhgaiartvihwydpqhhcwirfnessqrwegldaatg 

49 elhrr sveepref wgdiake-f ywktpcpgpf lryn

22 Jcwrs kytpidayfkfhrqtvenlesf— wesv 

R PF-GIATIWYPH RNES WE 

76 apvtvdypadyqpwqqaf ddseap- f yrwf s ggl tnacf nevdrhvm-mg

84 fdvtkgkifievnnkgattnicynvldrnvhekk 

52 -akelew f kpwdkvldasnpp-fykwfvggrlnlsylavdrhvk-tw 

PW FD S P FY WF GG TN C N VDRHV 

124 ygdevayyfegdrwdnslnngrggpwqetitrrrllvewkaaqvlr-d 

117 Igdkva fywegne pge ttqi tyhqll vqvcqf snvlr- k 

96 rknklaiewegepvden gyptdrrkltyydlyrevnrvaymlkqn 

GD VA Y EG D G P IT LLVEV A VLR 



SEQ ID NO: 39 
SEQ ID NO: 130 
SEQ ID NO: 131



173- IglkkgdrialnnpniiDpqiyyte-aakrlgilytpvfggfsdktlsdri 
155 qgiqkgdrvai yinpmipe 1 vvaml-acar iga Ihs i vfagf s a e s leer i 
141 fgvkkgdkitlylp-mvpeipitaalaawrlgaitswfsgfsadalaeri 
G KKGDRIAL MP IP T AARGL VF GFS L RI 



SEQ ID NO: 39
SEQ ID NO: 130 
SEQ ID NO: 131 



222 hnagarvvitsdgayroa dkyipvetaqaiva 

204 ldsscsllittdafyrgeklvnlkel-adealqkcqekgfpvrc — ciw 

190 ndsqsrivitadgfwrrgrwrlkev 

R VIT DG YR W- KE D AL K PV IV 



SEQ ID NO: 39 
SEQ ID NO: 130 
SEQ ID NO: 131 



268 qtlatlpltesqrqtiiteveaalageitversdvmrgvgsalaklrdld

251 khlgrael : -gmgdsts » 

216 vdaal 



V AAL 



G G 



SEQ ID NO: 39 
SEQ ID NO: 130 
SEQ ID NO: 131 



318 asvqakyrtvlaqalvespprveawwrhtg-qeilwnegrdrwshdll 

266 qsppikrscpdv qi s wnqgidlwwhelm 

221 ekatgvesvivlprlglkdvpmtegrdywwnklm 

ESPP VE V W G I WNEGRD W H L 



SEQ ID NO: 39
SEQ ID NO: 130 
SEQ ID NO: 131 



367 

294 qea 

255 q 

A 



d — vdae 

— gde cepewedae 

gippn — — ~ayiepep~vese 

P D A I CEP VDAE 



SEQ ID NO: 39
SEQ ID NO: 130 
SEQ ID NO: 131 



415 ypmftiytsgstgkpkgvihvhggyvagwhtlrvsfdacpgdtiyviad 
309 dplf ilytsgstglq)kgwhtvggynilyvattf kyvfdfhaedvf wctad 
272 hpsfilytsgttgJ^kgivhdtggwavhvyatmkwvfdirdddifwctad 
P FI YTSGSTGKPKGV H <3GY V T FD D AD 



SEQ ID NO: 39 
SEQ ID NO: 130 
SEQ ID NO: 131 



465 pgwitgqsyml^atmagrltgviaegsplfpsagryasiierygvqifka 
359 igwitghsyvtygplangatsvlfegiptypdvnrlwsivdkykvtkfyt 
322 igwvtghsywlgpllmgateviyegapdypqpdrwwsiierygvtifyt
GWITG SY A T VI EG P P R SIIERY-GV IF 
SEQ ID NO: 39
SEQ ID NO: 130 
SEQ ID NO: 131 



515 gvtflktvmsnpqnvedvrlydmhslrvatfcaepvspavqqfgmqimtp 
409 aptairllmkfgd — epvtkhsraslqvlgtvgepinpeawlwyhrwga 
372 sptairmfmryge — ewprkhdlstiriihsvgepinpeawrwayrvlgn 
T M E VR D SLRV EP P 



SEQ ID NO: 39 
SEQ ID NO: 130
SEQ ID NO: 131 



565 q yi nsywatehggivwthfygnqdfplrpdahtyplpwvmgdvw 

457 qrcpiv dtfwqtetgghmltplpgat — pmkpgsatfp ffgva 

420 e kvafgstwwmtetggivishapglylvpmkpgtngpplpgfevdv- 

Q W TE GGIV TH G PPT PLP DV 



SEQ ID NO: 39 
SEQ ID NO: 130 
SEQ ID NO: 131 



609 vaetdesgttryrvadfdekgeivitapypyltrtlwgdvpgfeaylrge

498 pallnesg eelegeaegylvf kqpwpgimrtvy 

466 .~^dengnp~-r-appgvkgylvikkpwpgmlhgtw 

A DESG A KG VI P P RT W 



SEQ ID NO: 39 
SEQ ID NO: 130 
SEQ ID NO: 131 



659 iplrawkgdaer f vktywr rgpngewgyiqgdf aikypdgsft lhgrpdd

531 gnherfettyfkkfpg yyvtgdgcqrdqdgyywitgridd 

496 gdperyiktywsrfpg mfyagdyaikdkdgyiwvlgrade 

GD ERF KTYfl R P Y GD AIK DG GR DD 



SEQ ID NO: 39 
SEQ ID NO: 130 
SEQ ID NO: 131 



709 vinvsghmgteeiegailr4rqitpdspvgnciwgaphrekgltpvaf 

571 mlnvsghlls taevesalve heavaeaawghphpvkgeclycf

536 Trikvaghrlgtyelesali shpavaesawgvpdaikgevpiaf

' VINVSGHR GT E E A V WG PH KG P AF 



SEQ ID NO: 39 
SEQ ID NO: 130
SEQ ID NO: 131 



759 igpapgrhltgadrrrldelvrtekgavsvpedyie-vsafpetrsgkym* 
615 vtlcdghtfspkl-teelkkqirekigpiatp-dyiqnapglpktrsgkim
580 wlkqgvapsdelrkelrehvrrtigpiaepaqi€€-vtklpktrsgkim 
G R L E VR G P DVI V P TRSGK M 



SEQ ID NO: 39 
SEQ ID NO: 130 
SEQ ID NO: 131 



808 rrflnunml-deplgdtttlrnpevleeiaaklaewkrrqrmaeeqqlie 

664 rr^lrkiaqndhdlgdmstvadpsvi 

629 rrllkavat-gaplgdvtt : 

RR LR D PLGD TT P V 



SEQ ID NO: 39 
SEQ ID NO: 130
SEQ ID NO: 131 



857 ryryfrieyhpptasagklawtvtnppvnalneraldelntivdhlarr 

690 ---- r_ 

647



SEQ ID NO: 39 907 
SEQ ID NO: 130 690 
SEQ ID NO: 131 647



qdvaaivftgqgarsfvagadlrqlleeihtveeaxnalpnnahlafrkie 

. — shl 

— ledetsveeak— 

US VEEA HL 



SEQ ID NO: 39 957 rrnnkpciaaingvalggglefamachyrvadvyaefgqpetnlrllpgyg 

SEQ ID NO: 130 693 : 

SEQ ID NO: 131 658 



SEQ ID NO: 39 
SEQ ID NO: 130 
SEQ ID NO: 131 



1007 gtqrlprllykrnngtgllralemilggrsvpadealklglidaiatgdq 

693 -T- 

658 raye 

RA E 



SEQ ID NO: 39 
SEQ ID NO: 130 
SEQ ID NO: 131 



1057 dslslacalaraaigadgqliesaavtqafrhrheqldewrkpdprfadd 

693 fshr 

662 

F RR 
SEQ ID NO: 39 1107 elrsiiahprieriirqahtvgrdaavhraldairygiihgfeaglehea

SEQ ID NO: 130 697

SEQ ID NO: 131 662

SEQ ID NO: 39 1157 klfaeawdpnggkrgiref Idrqsaplptrrplitpeqeqllrdqkell 

SEQ ID NO: 130 697 ~ 

SEQ ID NO: 131 662 

SEQ ID NO: 39 1207 pvgspf fpgvdripkwqyaqavirdpdtgaaahgdpivaekqiivpverp 

SEQ ID NO: 130 697 : 

SEQ ID NO: 131 662



SEQ ID NO: 39 1257 ranqaliyvlasevnfndiwaitgipvsrfdehdrdwhvtgsggigliva 

SEQ ID NO: 131 662 - : 

SEQ ID NO: 39 1307 lgeearregrlkvgdlvaiysgqsdllsplmgldpniaadfviqgndtpcig 

SEQ ID NO: 130 697 « 

SEQ ID NO: 131 662. 1 — 



SEQ ID NO: 39 1357 shqqfmlaqapqclpiptdmsieaagsyilnlgtiyralf ttlqikagrt

SEQ ID NO: 130 697 cl : tiq 

SEQ ID NO: 131 662 — eika 

CL T QIKA 

SEQ ID NO: 39 1407 if iegaatgtgldaarsaarnglrvigmvssssrastllaagahgainrk 

SEQ ID NO: 131 666 • 

SEQ ID NO: 39 1457 dpevadcf trvpedpsawaaweaagqpllamf raqndgrladywshage 

SEQ ID NO: 130 702 7 

SEQ ID NO: 39 1507 tafprsfqllgeprdghiptltfygatsgyhftf lgkpgsasptemlrxa 

SEQ ID NO: 130 702 

SEQ ID NO: 131 666 

SEQ ID NO: 39 1557 nlrageavliyygvgsddlvdtggleaieaarqmgariwvtvsdaqref 

SEQ ID NO: 130 702 . 

SEQ ID NO: 131 666 — 

SEQ ID NO: 39 1607 vlslgfgaalrgvvslaelkrrfgdefewprtinpplpnarqdpqglkeav 

SEQ ID NO: 130 702 

SEQ ID NO: 131 666 emart : 

E RT 

SEQ ID NO: 39 1657 rrfndlvfkplgsavgvf Irsadnprgypdliieraahdalavsamlikp

SEQ ID NO: 130 702 

SEQ ID NO: 131 671 
SEQ ID NO: 39 1707 ftgrivyf ediggrrysffapqiwvrqrriymptaqifgthlsnayeilr

SEQ ID NO: 130 702 ; 7 

SEQ ID NO: 131 671 ' 1



SEQ ID NO: 39 1757 lndeisaglltitepawpwdelpeahqamwenrhtaatywnhalprlg 

SEQ ID NO: 130 702 ' 

SEQ ID NO: 131 671 : ~



SEQ ID NO: 39 1807 lknrdelyeawtager 

SEQ ID NO: 130 702 

SEQ ID NO: 131 671 ~ -~ 
Figure 41

SEQ ID NO: 39 1 midtaplappraprsnpirdrvdweaqraaaladpgafhgaiartvihwy

SEQ ID NO: 132 1 ~ 

SEQ ID NO: 133 1



SEQ ID NO: 39 51 dpqhhcwirfnessqrwegldaatgapvtvdypadyqpwqqafddseapf

SEQ ID NO: 132 1 

SEQ ID NO: 133 1 md 



SEQ ID NO: 39 101 yrwf sggltnacfnevdrhvmmgygdevayyfegdrwdnslnngrggpw 

SEQ ID NO: 132 1 melnxv 

SEQ ID NO: 133 3 fnnv 

FN V LNN 

SEQ ID NO: 39 151 qetitrrrllvevvkaaqvlrdlglkkgdrialnmpninpqtyyteaaJcr 
SEQ ID NO: 132 6 



SEQ ID NO: 133 7 llnkddgial 

L K D IAL 

SEQ ID NO: 39 201 Igiiytpvf ggfsdktlsdrihnagarwitsdgayrnaqwpykeaytd

SEQ ID NO: 132 6 ~ 

SEQ ID NO: 133 17 : 



SEQ ID NO: 39 251 qaldkyipvetaqaivaqtlatlpltesqrgtiiteveaalageitvers 

SEQ ID NO: 132 6 vileke

SEQ ID NO: 133 17 



I E E 

SEQ ID NO: 39 301 dvnurgvgsalaklrdldasvqakvrtvlaqalvesrerveavvvvrhtgq 

SEQ ID NO: 132 12 

SEQ ID NO: 133 17 ■ 



SEQ ID NO: 39 351 eilwnegrdrwshdlldaalakilanaraagfdvhsendllnlpddqlir 

SEQ ID NO: 132 12 

SEQ ID NO: 133 17 iiin— 

I N 

SEQ ID NO:39 401 alyasipcepvdaeypmf iiytsgstgkpkgvihvhggyvagwhtirvs 

SEQ ID NO: 132 12 

SEQ ID NO: 133 21 : 



SEQ ID NO: 39 451 fdaepgdtiyviadpgwitgqsymltatmagrltgviaegsplfpsagry 

SEQ ID NO: 132 12 

SEQ ID NO: 133 21 . 



SEQ ID NO: 39 501 asiierygvqif kagvtflktvmsnpqnvedvrlydmhslrvatf caepv 
SEQ ID NO: 132 12 



SEQ ID NO: 133 21 
SEQ ID NO: 39 551 spavqqfgmqimtpqyinsywatehggivwthfygnqdfplrpdahtypl

SEQ ID NO: 132 12 

SEQ ID NO: 133 21 rpka « 

RP A 

SEQ ID NO: 39 601 pwvmgdvwvaetdesgttryrvadfdekgeivitapypyltrtlwgdvpg

SEQ ID NO: 132 12 

SEQ ID NO: 133 25

SEQ ID NO: 39 651 feaylrgeiplrawkgdaerfvktywrrgpngewgyiqgdf aikypdgsf 

SEQ ID NO: 132 12 

SEQ ID NO: 133 25 . 

V 

SEQ ID NO: 39 701 tlhgrpddvinvaghrmgteeiegailrdrqitpdspvgncivvgaphre 

SEQ ID NO: 132 12 

SEQ ID NO: 133 25 

SEQ ID NO: 39 751 Vegltpvafiqpapgrhltgadrrrldelvrtelcgavsvpedyievsafpe 

SEQ ID NO: 39 801 trsgkymrrf Irnmmldeplgdtttlrnpevleeiaakiaewkrrqrmae 

SEQ ID NO: 132 12 ~ — k — i *

SEQ ID NO: 39 851 eqqiieryryfrieyhpptasagklawtvtnpp-vnalneraldelnti 

SEQ ID NO: 132 12 gkvawtinrpkalnalnsdtlkemdyv 

SEQ ID NO: 133 25 ■ Inalnyetlkeldsv 

GK AWT P NALN L EL 

SEQ ID NO: 39 900 vdhlarrqdvaaivf tgqgarsfvagadirqlleeihtve-eamalpnna

SEQ ID NO: 132 40 igeiendsevlaviltgageksfvagadisexn-kemntiegrkfgilgnk

SEQ ID NO: 133 40 ldivendkeikvliitgsgektfvagadiaemsn—mtpl-eakkf slyg 

D V A TG G SFVAGADI E T E EA N 

SEQ ID NO: 39 949 hlafrkiernmkpciaaingvalggglefaxaachyrvadvyaefgqpein 

SEQ ID NO: 132 89 --vfrrlellekpviaavngfalgggceiamscdiriassiiarfgqpevg 

SEQ ID NO: 133 87 qkvf rkiemlskpviaavhgf algggc«Xsmacdiriasknakfgqpevg 

FRKIE KP IAA NG ALGGG E AMAC R A A PGQPE 

SEQ ID NO: 39 999 lrllpgyggtqrlprllykrnngtgllralemilggrsvpadealklgli 

SEQ ID NO: 132 137 lgitpgfggtqrlsrlv gmgmakqliftaqnlkadealriglv 

SEQ ID NO: 133 137 lgiipgf sgtqrlprli gtskakeliftgdminsdeaykigli 

L PG GGTQRLPRL G A E I G ADEALK GLI 

SEQ ID NO: 39 1049 daiatgdqdslslacalaraaigadgqliesaavtqafrhrheqldewrk 

SEQ ID NO: 133 180 skw 

SEQ ID NO: 39 1099 pdprfaddelrsiiahprifiriirqahtvgrdaavhraldairygiihgf 

SEQ ID NO: 132 181 

SEQ ID NO: 133 184 elsdli 

EL I 
SEQ ID NO: 39 1149 eagleheaklfaeawdpnggkrgirefldrqsaplptriTplitpeqeql

SEQ ID NO: 132 181 kweps— - el

SEQ ID NO: 133 190 eeakklak 

EAK A W P L 

SEQ ID NO: 39 1199 lrdqkellpvgspf fpgvdripkwqyaqavirdpdtgaaahgdpivaekq 

SEQ ID NO: 132 189 mntakei 

SEQ ID NO: 133 198 -« ■ kmmsksq 

KE Q 

SEQ ID NO: 39 1249 iivpverpranqaliyvlasevnfndiwaitgipvsrfdehdrdwhvtgs 

SEQ ID NO: 132 196 ank ivsnapva 

SEQ ID NO: 133 205 i :

I AN ' PV 

SEQ ID NO: 39 1299 ggiglivalgeearregrlkvgdlvaiysgqsdllsplmgldpmaadfvi 

SEQ ID NO: 132 207 vklskqainrgm 

SEQ ID NO: 133 206 aislakeainkg — 

V L EA G 

SEQ ID NO: 39 1349 qgndtpdgshqqfmlaqapqclpiptdmsieaagsyilnlgtiyralf tt 

SEQ ID NO: 132 219 qc-didtalafesea™ fgecfst 

SEQ ID NO: 133 218 metdld

QC I TD E ' F T* 

SEQ ID NO: 39 1399 lqikagrtif iegaatgtgldaarsaarnglrvigmvssssrastllaag 

SEQ ID NO: 132 240 edqkdamtafie ■ 

SEQ ID NO: 133 224 tgntieaekfsl 

K T FIE TG A 

SEQ ID NO: 39 1449 ahgainrkd^evadcftrvpedpsawaaweaagqpllamfraqndgrlad 

SEQ ID NO: 132 252 

SEQ ID NO: 133 236 eft 

CFT 

SEQ ID NO: 39 1499 ^yv^ageta^rsfqllseprfgh^ 

SEQ ID NO: 132 252 

SEQ ID NO: 133 239 

SEQ ID NO: 39 1549 ptemlrranlrageavliyygvgsddlvdtggleaieaarqmgariwvt 
SEQ ID NO: 132 252 ~—

SEQ ID NO: 39 1599 vsdaqrefvlslgfgaalrgvvslaelkrrfgdefewprtmpplpnarqd 

SEQ ID NO: 132 252 krk 

SEQ ID NO: 133 239 -tddqke gmiafse-kr 

D Q E G E KR 

SEQ ID NO: 39 1649 pqglkeavrrfndlvfkplgsavgvf lrsadnprgypdliieraahdala 

SEQ ID NO: 132 255 ie 

SEQ ID NO: 133 254 : 

IE 

SEQ ID NO: 39 1699 vsamlikpftgrivyfediggrrysffapqiwvrqrriyn^taqifgthl

SEQ ID NO: 132 257 

SEQ ID NO: 133 254 apk fgk — 

AP FG 
SEQ ID NO: 39 1749 snayeilrlndeisaglltitepawpwdelpeahqamwenrhtaatyw

SEQ ID NO: 132 257 

SEQ ID NO: 133 260 



SEQ ID NO: 39 1799 nhalprlglknrdelyeawtager 

SEQ ID NO: 132 257 gfknr 

SEQ ID NO: 133 260 

G KNR 
Figure 42

SEQ ID NO: 39 1 midtaplappraprsnpirdrvdweaqraaaladpgafhgaiartvihwy 

SEQ ID NO: 134 1 ; 

SEQ ID NO: 135 1 



SEQ ID NO: 39 51 dpqhhcwirfnessqrwegldaatgapvtvdypadyqpwqqafddseapf 

SEQ ID NO: 134 1 • maasaap 

SEQ 10 AA AP 

SEQ ID NO: 39 101 yrwf sggltnacfnevdrhvxnmgygdevayyfegdrwdnslrmgrggpv^ 

SEQ ID NO: 39 151 qetitrrrllvevvkaaqvlrdlgllckgdrialnmpnin?>qiyyteaakr 

SEQ ID NO: 134 8 



SEQ ID NO: 135 1 



SEQ ID NO: 39 201 Igilytgyfggf sdktlsdrihnagarwitsdgayrnaqwpykeaytd

SEQ ID NO: 134 8 j- awtg 

SEQ ID AT 

SEQ ID NO: 39 251 qaldkyipvetaqaivaqtlatlpltesqrqtiiteveaalageitvers 

SEQ ID NO: 134 12 q ■ taeak 

SEQ ID NO: 135 1 mtiqtlettalkd : - 

Q QTL T L T E 

SEQ ID NO: 39 301 dvmrgvgsalaklrdldasvqakvrtvlaqalvespprveavvvvrhtgq 

SEQ ID NO: 134 18 d ■ 

SEQ ID NO: 135 14 

D 

SEQ ID NO: 39 351 eilwnegrdrwshdlldaalakilanaraagfdvhsendllnlpddqiir 

SEQ ID NO: 134 19 ~ 

SEQ ID NO: 135 14 

SEQ ID NO: 39 401 alyasipcepvdaeypmf iiytsgstgkpkgvihvhggyvagwhtlrvs 

SEQ ID NO: 134 19 

SEQ ID NO: 135 14 r 

SEQ ID NO: 39 451 fdaepgdtiyviad^gwitgqsymltatmagrltgviaegsplfpsagry 

SEQ ID NO: 134 19 

SEQ ID NO: 135 14 « 

SEQ ID NO: 39 501 asiierygvqifkagvtflktvmsr^>qnvedvrlydxnhslrvatfcaepv

SEQ ID NO: 134 19 lyel . 

SEQ ID NO: 135 14 : — lyei



WO 02/42418 



PCT/US01/43607



SEQ ID NO: 39 551 spavqqfgmqimtpqyinsywatehggivwthf ygnqdfplrpdahtypl 

SEQ ID NO: 134 23 • : 

SEQ ID NO: 135 18 7 

SEQ ID NO: 39 601 pvrmgdvwvaetdesgttryrvadfdekgeivitapypyltrtlwgdvpg

SEQ ID NO: 134 23 

SEQ ID NO: 135 18

SEQ ID NO: 39 651 feaylrgeiplrawkgdaerfvktywrrgpngewgyiqgdfaikypdgsf

SEQ ID NO: 134 23 geip ; 

SEQ ID NO: 135 18 geip 

GEIP

SEQ ID NO: 39 701 tlhgrpddvinvsghrmgteeiegailrdrqitpdspvgnciwgaphre 

SEQ ID NO: 134 27 " 

SEQ ID NO: 135 22 

SEQ ID NO: 39 751 kgltpvaf iqpapgrhltgadrrrldelvrtekgavsvpedyievsafpe 

SEQ ID NO: 134 27 ' " 

SEQ ID NO: 135 22. pafhv P* 

P H P 

SEQ ID NO: 39 801 trsgkymrrf lrnmmldeplgdtttlrnpevleeiaakiaewlcrrqnnae 

SEQ ID NO: 134 27 pig hvpakmyawairr

SEQ ID NO: 135 29 t : - myawsirk 

T PLG AK W R 

SEQ ID NO: 39 851 eqqiieryryfrieyhpptasagklawtvtnppvnalneraldelntiv 

SEQ ID NO: 134 43 erh 

SEQ ID NO: 135 38 

ER 

SEQ ID NO: 39 901 dhlarrqdvaaivftgqgarsfvagadirqlleeihtveeamalpnnahl

SEQ ID NO: 134 46 r

SEQ ID NO: 135 38 T 

SEQ ID NO: 39 951 afrkiermnkpciaaingvalggglefaxnachyrvadvyaefgqpeinlr 

SEQ ID NO: 134 46 

SEQ ID NO: 135 38 erhgkp 

ER KP G PE 

SEQ ID NO: 39 1001 llpgyggtqrlprllykrnngtgllralemilggrsvpadealklglida 

SEQ ID NO: 134 50 

SEQ ID NO: 135 44 " 

SEQ ID NO: 39 1051 iatgdqdslslacalaraaigadgqliesaavtqaf rhrheqldewrkpd 

SEQ ID NO: 134 50 

SEQ ID NO: 135 44 tqamq™ 

TQA 

SEQ ID NO: 39 1101 prfaddelrsiiahprieriirqahtvgrdaavhraldairygiihgfea 

SEQ ID NO: 134 50 qsh 

SEQ ID NO: 135 49 

Q H 




SEQ ID NO: 39 1151 gleheaklfaeawdpnggkrgiref ldrqsaplptrrplitpeqeqllr 
SEQ ID NO: 134 53 > 



SEQ ID NO: 135 49



SEQ ID NO: 39 1201 dqkellpvgspffpgvdripkwqyaqavirdpdtgaaahgdpivaekqii 

SEQ ID NO: 134 53 -qlevlpv wei gd 

SEQ ID NO: 135 49 vewptweige 

Q E LPV V P W GD 

SEQ ID NO: 39 1251 vpverpranqaliyvlasevnfndiwaitgipvsrfdehdrdwhvtgsgg 

SEQ ID NO: 134 65 — - — devlvyvmaagvnyngvwaglgepispfdvhkgeyhiagsda

SEQ ID NO: 135 60 devlvlvmaagvnyngvwaalgepispldghkqpfhiagsda 

L YV A VNN WA GPSFDH H GS 

i* 

SEQ ID NO: 39 1301 iglivalgeearregrlkvgdlvaiysgqsdllsp-lingldpm-aadfv- 

SEQ ID NO: 134 107 sgivwkvgakvk rwkvgdevivhcnqddgddeecnggdpm-f sptqr 

SEQ ID NO: 135 102 sgivwkvgakvk rwklgdevvihcnqddgddeecnggdfcxof sssqr- 

G G R KVGD VI Q D G DPM 

SEQ ID NO: 39 1348 iqgndtpdgshqqfmlaqapqclpiptdmsieaagsyilnlgtiyralf- 
SEQ ID NO: 134 153 iwgyetgdgsfaqfcrvqsrqlmarpkhltweeaacytltlatayrmlfg 
SEQ ID NO: 135 148 iwgyetpdgsfaqf crvqsrqllprpkhltweesacytltlatayrmlf g

I G TPDGS QF Q Q LP P EAYLLTYRLF 

SEQ ID NO: 39 1397 -ttlqikagrtif iegaatgtgldaarsaarnglrvigmvssssrastll 
SEQ ID NO: 134 203 haphtvrpgqnvllwgasgglgvfgvqlcaasganaiavisdeskrdyvm 
SEQ ID NO: 135 198 hkphelkpgqnvlvwgasgglgvf atqlaavaganaigwssedkrefvl 

K G I GA G G A AA G IG VSS S L 

SEQ ID NO: 39 1446 aagahgainrkdpevadcf trvpedpsawaaweaagqpllamfraqndgr 

SEQ ID NO: 134 253 slgakgvinrkd fdc w 

SEQ ID NO: 135 248 smgakavlnrge fncwgqlpk 

GA G INRKD DC P 

SEQ ID NO: 39 1496 ladywshagetafprsfqllgeprdghiptltf ygatsgyhftf lgkpg ' 

SEQ ID NO: 134 269 gqlptv 

SEQ ID NO: 135 269 vngpef 

G PT G F 

SEQ ID NO: 39 1546 sasptemlrranlrageavliyygvgsddlvdtggleaieaarqmgariv 

SEQ ID NO: 134 275 

SEQ ID NO: 135 275 1 



SEQ ID NO: 39 1596 wtvsdaqrefvlslgfgaalrgwslaelkrrfgdefewprtn^plpna 

SEQ ID NO: 134 275 ns 

SEQ ID NO: 135 275 ndymke srkfgkai-wqit 

D E R FG W T N 

SEQ ID NO: 39 1646 rqdpqglkeavrrfndlvfkplgsavgvf lrsadnprgypdliieraahd 

SEQ ID NO: 134 277 peyntwlkea-rkfgkaiwditgkgndv dlvfehpgea 

SEQ ID NO: 135 293 gnkdv- dmvfehpgeq 

GLKEA R F G V D E 

SEQ ID NO: 39 1696 alavsamli)q>£.tgrivyfediggrrysf fapqiwvrqrriymptaqifg 

SEQ ID NO: 134 314 *fpvstlvakr-ggroivf cagttgfnitfdaryvwmrqkriq g 

SEQ ID NO: 135 308 tfpvsvflvkr-ggmvvicagt-tgfnltmdarflwmrqkrvq g 

VSLKGIV G FAWRQRI <3 
SEQ ID NO: 39 1746 thlsnayeilrlndeisaglltitepawpwdeipeahqamwenrhtaat
SEQ ID NO: 134 356 shfahlkqasaanqfvmdrrvdpcmsevfpwdkipaahtkmwknqhppgn
SEQ ID NO: 135 350 shfanlmqasaanqlvidrrvdpclsevfpwdqipaahekmlanqhlpgn

H N N V PWD P AH MW N H 

SEQ ID NO: 39 1796 yvvnhalprlglknrdelyeawtager
SEQ ID NO: 134 406 mavlvnstraglrtvedvieagplkam 
SEQ ID NO: 135 400 mavlvcaqrpglrtfeevqelsgap— 

V R <3L E EA A 
Figure 46
Figure 47
Figure 48
Figure 49

ATGGCGACGGGGGAGTGCATGAGG<SGAACAGGACGACTGGCAGGAAAGATTGCGTTAATT 
ACCGGTGGCGGGGGCAATATCGGCAGTGAATTGACACGTCGCTTTCTCGCAGAQGGAGGG 

agggtcattattagtggacggaat<:gggggaagtt<3acc<^:actggcggaacggatgcag 
gcagaggcaggagtgccggcaaagcgcatogatctc<3aagt-catggatgggagtgatcgg 

GTCGCGGTAGGTGGCGGTATGGAAGCGATTGTGGGCCGTCACGGCCAGATGGACATTCTG 
•GTCAACAATGCAGGAAGTGCCGGTGCCCAGGGTCGTCTGGCCGAGATTC 
GCTGAATTAGGOCCTGGCGCGGAAGAGACGCTTCATGCCAGCAT<KX^ 
ATGGGATGGCATCTGAT(X;GTATTGCGGCACCTCATATGCCGGTAGGAA<3TGCGGTCATC 
AATGTCTCGACCATCTTTTCAOGGGCTGAtSTACTACGGGCGGATTCCGTATGTCACCCCT 
AAA<^TGCTCTTAATCCTCTATCTCAACTTGCTGCGCG^^ 

cgcgttaatacgatctttcccggcccgattgaaagtgatcgcatccgtacagtgttccag 

cgtatggatcagctcaaggggcggcccgaaggcgacacagggcaccattttttgaacacc 

atgcgattgtgtcgtgccaacgaccagggcgggcttgaacgtcggttogcctccgtcggt 

gatgtggcagacgccgctgtctttctggccagtgccgaat^gccgctctctccggtgag 

acgattgaggttacgcacggaatggagttgccggcctgcagtgagaccagcctgctggcc 

cgtactgatctgcgcaggattgatgccagtggccgcacgacgctcatctgcgccggcgac 

cagattgaagaggtgatggcgctcagcggtatgttgcgtacctgtgggagtgaagtgatc 

atcggcttccgttcggctgcggcgctggcgcagttcgagcaggcagtcaatgagagtggg 

cggctggccggcgcagactttacgcctcccattgccttgccactcgatccaggcgatgcg 

gcaacaattgacgctgtcttcgattgggcgggcgagaataccggcgggattcatgcagcg 

gtgattctgcctgctaccagtcaggaacc^3cacggtccgtgattgaggttgatgatgag 

cgggtgctgaattttctggccgatgaaatcagggggacaattgtgattgccagtcgcctg 

gccggttactggcagtcgcaacggcttacgcccggcgcacgtgcgggtggggcgcgtgtc 

atttttctctc ^^ cggtcc <:gatcaaaaix^gaatgtttacggacgcattcaaagto 

gctatcggtcagctcattcgtgtgtggggtcacgaggctgaacttgactatcagcgtgcc 

agcgccgccggtgatcatgtgctgccgccggtatgggccaatcagattgtgggcttcgct 

aaccgcagccttgaagggttagaatttgcctgtgcctggacagctcaattgctccatagt 

caacggcatatcaatgagattaccctcaacatccctgccaacattagcgccaccaccggc 

gcacgcagtgcatcggtcggatgggcggaaagcctgatcgggttgcatttggggaaagtt 

gcgttgattaccggtggcagcgccggtattggtgggcagatcgggcgcctcctggctttg 

agtggcgggcgcgtgatgctggcagcccgtgatcggcataagctcgaacagatgcagggg 

atgatgcaatctgagctggctgaggtggggtataccgatgtggaagatcgggtccacatt 

gcaccgggctgcgatgtgagtagcgaagcgcagcttgcggatcttgttgaacgtaccctg 

tcagcttttggcaccgtcgattatctgatcaacaacgccgggatcgccggtgtcgaagag 

atggttatcgatatgccagttgagggatggcgccataccctcttcgccaatctgatcagc 

aactactcgttgatgggcaaactggcgccgttgatgaaaaaacagggtagcggttacatc 

cttaacgtctgatgatactttggcggtgaaaaagatgcggccattccctacgccaacggt 

GGCGATTACGCGGTCTCGAAGGCTGGTCAGCGGGCAATGGCCGAAGTCTTTGCGGGCTTC 
CTTGGCGGGGAGATACAGATCAATGCCATTGCGCCGGGTCCGGTGGAAGGTGATCGCTTG 
€GCGGTAGG^TGAACGTCC<:GGCCTCTTTGCGCGTGGGGCGCX3GCTGATTTTGGAGAAC 

aagcggctgaatgagcttcacgctgctcttatcgcggctgcgcgcaccgatgagcgatct 
atgcacg7^ctggttgaactgctcttacgcaatgatgtggcckx^ctagagcagaatccc 
gc^gcacgtacggggttgcgtgaactggcacgacgttttggcagcgaaggggatcgggcg 
gcatcatgaagcagtgggctgctgaaccgttcaattgc^^ 

CATAATGGTGGCTATGTGTTGCCTGCCGACATCTTTGCAAACCTGCCAAACCGGCCGGAT 
CGCTTCTTCACCCGAGCCGAGATTGATOGGGAGGGTCGCAAGGTTCGTGAGGGCATCATG 
GGGATGCTCTACCTGCAAGGGATGGCGACTGAGTTTGATGTCGCT^ATGGCCACGGTCTAT 
TACGTTGGGGACCGCAATGTCAGTGGTGAGACATTCCAGGCATCAGGTGGTTTGGGTTAC
GAACGCAGCCCTACCGGTCGCGAACTCTTCG

CTGGTCGGAAGCACGGTCTATCTGATAGGTGAACATCTGACTGAACACCTTAACCTGCTT 
GCCCGTOCGTACCTCGAACGTTACGGGGCACGTCAGGTAGTGATGATTGTTGAGACAGAA 

accggggcagagacaatgc<;tcgcttgctccacgatcacgtcgaggctggtcggctgatg 

ACTATTGTGGCCGGTGATCAGATCGAAGCCGCTATCGACCAGGCTATCACTCGCTACGGT 

CGCCCAGGGCCGGTCGTCTGTACCCCCTTCCGGCCACTGCCGACGGTA<^ 

GGTAAA<3ACAGTX3AGTGGAGCACAGTGTTGA<3TGAtS^ 

CACCAGCTCACCCACCATTTCC<5GGTA«:GCGCAAGATTGCCCTGAGTGATGGTGCCAGT 

CTCGCGCTGGTCACTGCCGAAACTACGGCTACCTCAACTACCGAGCAATTTGCTCTGGCT 

AACTTCATCAAAACGACCCTTCAGGCTTTTAGGGCTACGATTGGTGTCGAGAGGGAAAGA 

ACTGCTCAGCGCATTCTGATCAATCAAGTC^ATCTCACCCGGCGTjGCGCGTGGCGAAGAG 

CGGCGTGATC€GCACGAGCGTCAACAAGAACTGGAACGTTTTATG<3AGGCAGTCTTGCTG 

<3TCACTGCACCACTCGCGCCTGAAGCCGATACCCGTTftCGGC^ 

CGGGCGATTACCGTGTAA (SEQ ID NO: 140)

Figure 50

MATGESMSGTGRLAGKIALITGGAGNIGSELTRRFLAEGATVIISGRNRAKLTALAERMQ

AEAGVPAKRIDLEVMDGSDPVAVRAGIEAIVARHGQIDILVNNAGSAGAQRRLAEIPLTE

AELGPGAEETLHASIANLLGMGWHLMRIAAPHMPVGSAVINVSTIFSRAEYYGRIPYVTP

KAALNALSQLAARELGARGIRVNTIFPGPIESDRIRTVFQRMDQLKGRPEGDTAHHFLNT

MRLCRANDQGALERRFPSVGDVADAAVFLASAESAALSGETIEVTHGMELPACSETSLLA

RTDLRTIDASGRTTLICAGDQIEEVMALTGMLRTCGSEVIIGFRSAAALAQFEQAVNESR

RLAGADFTPPIALPLDPRDPATIDAVFDWAGENTGGIHAAVILPATSHEPAPCVIEVDDE

RVLNFLADEITGTIVIASRLARYWQSQRLTPGARARGPRVIFLSNGADQNGNVYGRIQSA

AIGQLIRVWRHEAELDYQRASAAGDHVLPPVWANQIVRFANRSLEGLEFACAWTAQLLHS

QRHINEITLNIPANISATTGARSASVGWAESLIGLHLGKVALITGGSAGIGGQIGRLLAL

SGARVMLAARDRHKLEQMQAMIQSELAEVGYTDVEDRVHIAPGCDVSSEAQLADLVERTL

SAFGTVDYLINNAGIAGVEEMVIDMPVEGWRHTLFANLISNYSLMRKLAPLMKKQGSGYI

LNVSSYFGGEKDAAIPYPNRADYAVSKAGQRAMAEVFARFLGPEIQINAIAPGPVEGDRL

RGTGERPGLFARRARLILENKRLNELHAALIAAARTDERSMHELVELLLPNDVAALEQNP

AAPTALRELARRFRSEGDPAASSSSALLNRSIAAKLLARLHNGGYVLPADIFANLPNPPD

PFFTRAQIDREARKVRDGIMGMLYLQRMPTEFDVAMATVYYLADRNVSGETFHPSGGLRY

ERTPTGGELFGLPSPERLAELVGSTVYLIGEHLTEHLNLLARAYLERYGARQVVMIVETE

TGAETMRRLLHDHVEAGRLMTIVAGDQIEAAIDQAITRYGRPGPVVCTPFRPLPTVPLVG

RKDSDWSTVLSEAEFAELCEHQLTHHFRVARKIALSDGASLALVTPETTATSTTEQFALA

NFIKTTLHAFTATIGVESERTAQRILINQVDLTRRARAEEPRDPHERQQELERFIEAVLL

VTAPLPPEADTRYAGRIHRGRAITV (SEQ ID NO: 141)






WO 02/42418 



PCT/US01/43607 



Figure 51 

GAATGGAGTTGCCGGCCTGCAGTGAGA€CAGCCTGCTGGC<XGT^^ 

TTGATGCCAGTGGCCGCACGACGCTCATCTGGGCCGGGGACCAGATTGAAGAGGTGATGG 

•GGCTCAC<X^TATGTTGCGTACCTGTGGGAGTGAAGTGATC^^ 

CGGCGCTGGCCCAGTTCGAGCAGGCAGTCAATGAGAGTCGGGGGCTGGGCGGGGCAGACT 
TTACGCCTCCCATTGGCTTGCCACTCGATCCACGCG (SEQ ID NO: 142)

Figure 52

SEQ ID NO: 141 1 matgesmsgtgrlagkialitggagnigseltrrflaegatviisgrnra

SEQ ID NO: 143 1 mfankvvlvtggssgigaatveafvkegasvafvgrnqa

SEQ ID NO: 144 1 mrlegkvclitgaasgigkattllf aqegatviagdiske 

SEQ ID NO: 145 1 

SEQ ID NO: 146 1 ***** 

SEQ ID NO: 147 1 mrllhkrtlvtggsdgiglaiaeaf lsegadvlivgrdaa 

SEQ ID NO: 141 51 kltalaermqa~«-agvpakridlevmclgscipvavragieaivarhgqi 

SEQ ID NO: 143 40 klkevesrcqq—hganilaikadv skdeeakiivqqtvdkfgkl 

SEQ ID NO: 144 41 nldslvk— ea— e — glp gkv 

SEQ ID NO: 145 1 : 

SEQ ID NO: 146 5 

SEQ ID NO: 147 41 kleaarqklaalgq-aga vetssadlatslgvatweqvketgrpl 

SEQ ID NO: 141 98 dilvnnagsagaqrrlaeiplteaelgpgaeetlhasianllgmgwhlmr 

SEQ ID NO: 143 83 dvlvnnagil rfasv— leptliqtfdetiontnlrpvv lits 

SEQ ID NO: 144 57 d 



SEQ ID NO: 145 1 

SEQ ID NO: 146 5 " 

SEQ ID NO: 147 86 dipinnagvadl vpfesv seaqfqhsfalnvaaaffltq 

SEQ ID NO:141 148 iaaphm-pvgsavinvstif sr-aeyygrip~yvtpkaalnalsqlaar 

SEQ ID NO: 143 123 laiphliatkgsivnvssilstivripgims-- ysvskaaxndhftklaal 

SEQ ID NO: 144 58 p«yv lnv 

SEQ ID NO: 145 1 " 

SEQ ID NO: 146 5 php-p — M " 

SEQ ID NO: 147 125 gllphf-gagasiinissyf ar-kmipkrpssvyslskgalnsltrslaf

SEQ ID NO: 141 394 tggihaavilpatshepapcvievddervlnfladeitgtiviasrlary

SEQ ID NO: 143 205 tg ahtp 

SEQ ID NO: 144 73 . 

SEQ ID NO: 145 29 lp : 

SEQ ID NO: 146 24 qplp * 

SEQ ID NO: 147 200 

SEQ ID NO: 141 444 wqsqrltpgarargprvif lsngadqngnvygriqsaaigqlirvwrhea 

SEQ ID NO: 143 211 

SEQ ID NO: 144 . 73 ? 

SEQ ID NO: 145 31 lsededyrgs— -gklk 

SEQ ID NO: 146 28 * dhg 



SEQ ID NO: 147 200 



SEQ ID NO: 141 494 eldyqrasaagdhvlppvwanqivrf anrsleglef acawtaqllhsqrh 
SEQ ID NO: 143 211 



SEQ ID NO: 144 73 

SEQ ID NO: 145 45 

SEQ ID NO: 146 31 ensyqgsgrlkd- 

SEQ ID NO: 147 200 ■ 



SEQ ID NO: 141 544 ineitlnipanisattgarsasvgwaesliglhlgkvalitggsagiggq 

SEQ ID NO: 143 211 lgkaa

SEQ ID NO: 144 73

SEQ ID NO: 145 45 -« gkvaiitggdsgigra 

SEQ ID NO: 146 43 kraiitggdsgigra 

SEQ ID NO: 147 200 llpa

SEQ ID NO: 141 594 igrllalsgarvmlaardrhk-leqmqamiqselaevgytdvedrvhiap

SEQ ID NO: 143 216 qse 

SEQ ID NO: 144 73 ; 

SEQ ID NO: 145 61 aaiafakegadisilyldehsdaeetrkrieke nvrcllip 

SEQ ID NO: 146 58 vaiayaregadvlisylsehd damatkalve eagrkavlaa 

SEQ ID NO: 141 643 gcdvsseaqladlvertlsafgtvdylinnagiagveemvidmpvegwrh

SEQ ID NO: 143 219 eiadmi : 

SEQ ID NO: 144 73 vekwqkygridvlvnnagitr-dallvrmkeedwda 

SEQ ID NO: 145 102 g-dvgdenhceqavqqtvdhfgkldilvnnaaeqhpqdsilnisteqlek 
SEQ ID NO: 146 99 g-diqssdhcrrivetavrelggidilvnnaahqatf kniedisdeewel 

SEQ ID NO: 147 204 — — eakaelkayvers 

SEQ ID NO: 141 693 tlfanlisnyslmrklaplmkkqgsgyilnvssyfggekdaaipypnrad 

SEQ ID NO: 144 109 vinvnlkgvfnvtqmwpymikqrngsivnvsswg — - — iygnpgqtn 

SEQ ID NO: 145 151 tfrtnifsmfhmtkkalphl—qegcaiinttsitayegdtal id 

SEQ ID NO: 146 148 tf rvnmhamfyltkaavphmkk-gsa-iintaai iiadvpnpilla 

SEQ ID NO: 147 217 

SEQ ID NO: 141 743 yavskagqramaevfarfl-gpe-iqinaiapgpvegdrlrgtgerpglf 

SEQ ID NO: 143 225 ; ■» 

SEQ ID NO: 144 154 yaaskagvigmtktwakelagrn-irvnavapgfie — 

SEQ ID NO: 145 194 ysstkgaivsftrsmaksl-adkgirvnavapgpi 

SEQ ID NO: 146 191 yattkgaihnfsaglaqml-aergirvnwapgpi 

SEQ ID NO: 147 217 yplgrigr

SEQ ID NO: 141 791 arrarlilenkrlnelhaaliaaartdersmhelvelllpndvaaleqnp

SEQ ID NO: 143 225 : 

SEQ ID NO: 144 189 

SEQ ID NO: 145 228 wtp 

SEQ ID NO: 146 225 wtp lip s tmpedt va- df g k 

SEQ ID NO: 147 225 pddlagm 

SEQ ID NO: 141 841 aaptalrelarrfrsegdpaassssallnrsiaakllarlhnggyvlpad

SEQ ID NO: 143 225 

SEQ ID NO:144 189 

SEQ ID NO: 145 231 lipatfpe 

SEQ ID NO: 146 244 qvp mkrpgqpvelasa yvmlad 

SEQ ID NO: 141 891 ifanlpnppdpfftraqidrearkvrdgimgmlylqrmptefdvamatvy

SEQ ID NO: 143 225 vy 

SEQ ID NO: 144 189 

SEQ ID NO: 145 239 — ekvkq 

SEQ ID NO: 146 266 pmssy 

SEQ ID NO: 147 232 ? av 

SEQ ID NO: 141 941 yladrnvsgetfhpsgglryertptggelfglpsperlaelvgstvylig

SEQ ID NO: 143 227 .lasdk aksvtgscyi— 

SEQ ID NO: 144 189 tpmteklpekareta 

SEQ ID NO: 145 244 hgldtp 

SEQ ID NO: 146 271 vsgatiavtgg 

SEQ ID NO: 147 234 yla sdeaawtsggi 

SEQ ID NO: 141 991 ehltehlnllaraylerygarqvvmivetetgaetmrrllhdhveagrlm 

SEQ ID NO:143 242 

SEQ ID NO: 144 204 lsriplgrfgkpe — evaqvi 

SEQ ID NO: 145 250 

SEQ ID NO: 146 282 

SEQ ID NO:147 248 — 



SEQ ID NO: 141 1041 tivagdqieaaidqaitrygrpgpvvctpfrplptvplvgrkdsdwstvl

SEQ ID NO: 143 242 

SEQ ID NO: 144 223 If lasdessyvtgqvi gidgglvi- 

SEQ ID NO: 145 250 ; mgrpgqpv 

SEQ ID NO: 146 282 kpfl 

SEQ ID NO: 147 248 favdggyt 

SEQ ID NO: 141 1091 seaefaelcehqlthhfrvarkialsdgaslalvtpettatstteqfala 

SEQ ID NO: 143 242 mdnglalq 

SEQ ID NO: 144 247 

SEQ ID NO: 145 258 eha gayvllasdes 

SEQ ID NO: 146 286 

SEQ ID NO: 147 256



SEQ ID NO: 141 1141 nfikttlhaftatigvesertaqrilinqvdltrraraeeprdpherqqe

SEQ ID NO: 143 250 

SEQ ID NO: 144 247 ' 

SEQ ID NO: 145 272 symtgqtihvn 

SEQ ID NO: 146 286 

SEQ ID NO: 147 256

SEQ ID NO: 141 1191 lerfieavllvtaplppeadtryagrihrgraitv

SEQ ID NO:143 250 

SEQ ID NO: 144 247 

SEQ ID NO: 145 283 ggrfist

SEQ ID NO: 146 286 " 

SEQ ID NO: 147 256 a 9"

Figure 53

SEQ ID NO: 140 1 atggcgacgggggagtccatgagcggaacaggacgactggcaggaaagat

SEQ ID NO: 148 1 atga ■ gacttctgcacaagcg 

SEQ ID NO: 14 9 1 atg ttcgcaaataaagt 

SEQ ID NO: 150 1 atgaggcttgaagggaaag™ 

SEQ ID NO: 151 1 r atggaaa 

SEQ ID NO: 152 1 

SEQ ID NO: 140 51 tgcgt-taattaccggtggcgccggcaatatcggcagtgaattgacacgt 

SEQ ID NO: 148 21 cacgc-tggtgaccggcggctc 

SEQ ID NO: 149 18 ggtac-tagtaacaggtggtagctccggtatcggc 

SEQ ID NO: 150 20 tgtgtctgatcacagg ggctgcaagcgggatagggaaa-gccacca 

SEQ ID NO: 151 8 aatttccgca t -occt 

SEQ ID NO: 152 1

SEQ ID NO: 140 100 cgctt — tctcgcagagggagcgacggtcattattagtggacggaatcgg 

SEQ ID NO: 148 42 ggacggtatcgg 

SEQ ID NO: 149 52 gcagctactgt 

SEQ ID NO: 150 65 cgcttcttttcgcacaggaag — ga 

SEQ ID NO: 151 22 -ccctt — tc • 

SEQ ID NO: 152 1 

SEQ ID NO: 140 148 gcgaagttgaccgcactggccgaacggatgcaggcagaggcaggagtgcc

SEQ ID NO: 148 54 cc tggcaatcgccgaggcgttcctgagcgagg

SEQ ID NO: 149 63 ggaagcattc 

SEQ ID NO: 150 88 gctacggtgatcg — ctggc --gat 

SEQ ID NO: 151 29

SEQ ID NO: 152 1 gtgaacccaatgg acaga — caaacagaaggacaag

SEQ ID NO: 140 198 ggcaaagcgcatcgatctcgaagtcatggatgggagtgatccggtcgcgg

SEQ ID NO: 148 86 gcgc cgatgtcct 

SEQ ID NO: 149 73 gttaaggaagg 

SEQ ID NO: 150 109 atctcga

SEQ ID NO: 151 29 « 

SEQ ID NO: 152 35 aaccgcagc ■ atcagg 



SEQ ID NO: 140 248 tacgtgccggtatcgaagcgattgtggcccgtcacggccagatcgacatt 

SEQ ID NO: 148 99 gatcgtcggccgtgacgcc 

SEQ ID NO: 149 84 *:gcttctgtagccttcgtg 

SEQ ID NO: 150 116 aagaaaatctcgactct

SEQ ID NO: 151 29 cccgcca 

SEQ ID NO: 152 50 acagacagccgggcatt 

SEQ ID NO: 140 298 ctggtcaacaatgcaggaagtgccggtgcccagcgtcgtctggccgagat 

SEQ ID NO: 148 118 > <3<^ 

SEQ ID NO: 149 103 ggaagaaaccaagccaag— 

SEQ ID NO: 150 133 <:ttgtgaaagaggcagaagg

SEQ ID NO: 151 36 aacccaggaaatgcc 

SEQ ID NO: 152 67 g-agtcaaaaatgaa -tccgctgcc

SEQ ID NO: 140 348 tceactcactgaagctgaattaggccctggcgccgaagagacgcttcatg 

SEQ ID NO: 148 121 aagct ^cgaagccgcgc g 

SEQ ID NO: 149 121 cttaag— gaagtag agagccgc tg 

SEQ ID NO: 150 153 : 

SEQ ID NO: 151 51 

SEQ ID NO: 152 90

SEQ ID NO: 140 398 ccagcatcgccaatttacttggtatgggatggcatctgatgcgtattgcg

SEQ ID NO: 148 138 ccagaagc tggcg

SEQ ID NO: 149 144 ccagcagc 

SEQ ID NO: 150 153 -— - actt 

SEQ ID NO: 151 51 : c *

SEQ ID NO: 152 90 getgtcagaggacgaggattatc 

SEQ ID NO: 140 448 gcacetcatatgccggtaggaagtgcggt<:ateaatgtctcgaccatctt 

SEQ ID NO: 148 151 gc tcttggcca 

SEQ ID NO: 149 152 ' atggagccaacatc-- 

SEQ ID NO: 150 157 ccgg—ggaag — 

SEQ ID NO: 151 53 gcac : ' 

SEQ ID NO: 152 113 g -aggaa 

SEQ ID NO: 140 498 ttcacgggctgagtactacgggcggattccgtatgtcacccctaaagctg

SEQ ID NO: 148 162 ggc

SEQ ID NO: 149 166 ctggctatcaaag cagatgtctcc aaag 

SEQ ID NO: 150 166 ~ 

SEQ ID NO: 151 57 tac— cgatcggatgc- a ?^f? 

SEQ ID NO: 152 119 gcgg aaaactg



SEQ ID NO: 140 548 ctcttaatgctctatctcaacttgctgcgcgtgagttaggtgcacgtggc 

SEQ ID NO: 148 165 —————— cggcgc « ggtggagacgtc 

SEQ ID NO: 149 194' acgagga 

SEQ ID NO: 150 166 r: ' 

SEQ ID NO: 151 76 c tgcccgat cacgggg- 

SEQ ID NO: 152 130 aaaggaa—- aagtt^—

SEQ ID NO: 140 598 atccgcgttaatacgatctttcccggcccgattgaaagtgatcgcatccg

SEQ ID NO: 148 183 gtccgc cgatcttgcc ~~ 

SEQ ID NO: 149 201 age gaaaatcatcgta 

SEQ ID NO: 150 166 gttgatccctacgtt ttgaacgtgaccg 

SEQ ID NO: 151 92 aaaac * tcct 

SEQ ID NO: 152 143 cgatcattactgg 

SEQ ID NO: 140 648 tacagtgttccagcgtatggatcagctcaaggggcggcccgaaggcgaca 

SEQ ID NO: 148 199

SEQ ID NO: 149 217 s « "'■ f 

SEQ ID NO: 150 194 -acag ggatcagataaag gaag 

SEQ ID NO: 151 101 accagggttcc ggacgcctgaag 

SEQ ID NO: 152 156 -: aggegaca 

SEQ ID NO: 140 698 cagcgcaccattttttgaacaccatgcgattgtgtcgtgccaacgaccag

SEQ ID NO:148 199 accag 

SEQ ID NO: 149 217 caacaa 

SEQ ID NO: 150 215 ttgtggaaaa agtcgttcaaa ~ag 

SEQ ID NO: 151 124 gacaag 

SEQ ID NO: 152 164

SEQ ID NO: 140 748 ggcgcgcttgaacgtcggttcccctccgtcggtgatgtggcagacgccgc

SEQ ID NO: 148 204 CCt 

SEQ ID NO: 149 223 — ' aC 

SEQ ID NO: 150 238 tacg gtcgaatc gatgt- ~ 

SEQ ID NO: 151 130 agagc — cateatcaccggcgggga cagcggcatc 

SEQ ID NO: 152 164

SEQ ID NO: 140 798 tgtctttctggccagtgccgaatccgccgctctctccggtgagacgattg

SEQ ID NO: 148 207 cggtgtcgcaaccgtcg-tcgagcaggtgaaa - 

SEQ ID NO: 149 225 tgtc gacaagttc gggaagcttg 

SEQ ID NO: 150 255 tctggtga 

SEQ ID NO: 151 163 gg cagggccgtggcga tcgcc 

SEQ ID NO: 152 164 

SEQ ID NO: 140 848 aggttacgcacggaatggagttgccggcctgcagtgagaccagcctgctg 

SEQ ID NO: 148 238 gagaccggcc 

SEQ ID NO: 149 248 atgt 

SEQ ID NO: 150 263 — 

SEQ ID NO: 151 184 tatgcgcgcgagggag — • ; c 

SEQ ID NO: 152 164 -gcggaat — r- agggagagc — — 

SEQ ID NO: 140 898 gcccgtactgatctgcgcacgattgatgccagtggccgcacgacgctcat 

SEQ ID NO: 148 248 ggccgctcgacattcct 

SEQ ID NO: 149 252 gcttgtt aacaacgc

SEQ ID NO: 150 263 acaacgc

SEQ ID NO: 151 201 ggacgtccttatcagc tat 

SEQ ID NO: 140 948 ctgcgccggcgaccagattgaagaggtgatggcgctcaccggtatgttgc 

SEQ ID NO: 148 265 .at caacaatg — ccggt- 

SEQ ID NO: 149 267

SEQ ID NO: 150 270 

SEQ ID NO: 151 220 ctgag cgagcatgacgacgcgatggccaccaaggct 

SEQ ID NO: 152 180 ~ 

SEQ ID NO: 140 998 gtacctgtgggagtgaagtgatcatcggcttccgttcggctgcggcgctg

SEQ ID NO: 148 280 gtcgccgacctc

SEQ ID NO: 149 267 tgggatt ctacggttcg

SEQ ID NO: 150 270 gggaat 

SEQ ID NO: 151 256 ctggtggag-gaag 

SEQ ID NO: 152 180 

SEQ ID NO: 140 1048 gcccagttcgagcaggcagtcaatgagagtcggcggctggccggcgcaga

SEQ ID NO: 148 292 gtgccgttcga ■ gagcgtcagcg aggegca — 

SEQ ID NO: 149 284 cgagtgt tctggagccga 

SEQ ID NO: 150 276 

SEQ ID NO: 151 269 caggtcgc-aaggccgt gcttgccgccggcga 

SEQ ID NO: 152 180 agcag 

SEQ ID NO: 140 1098 ctttacgcctcccattgccttgccactcgatccacgcgatccggcaacaa 

SEQ ID NO: 148 321 gttccagcactcc 

SEQ ID NO: 149 302 cttta-- ataca aactt 

SEQ ID NO: 150 276 —aacaa 

SEQ ID NO: 151 300 c atccagtcg-tccg acca 

SEQ ID NO: 152 185 ctattgcctt ~ 

SEQ ID NO: 140 1148 ttgacgctg—tcttcgattgggccggcgagaataccggcgggattcatg 

SEQ ID NO: 148 334 ttcgcgctc aatgtggcgg cggcg 

SEQ ID NO: 149 317 ttga 

SEQ ID NO: 150 281 gggatgc gcttcttg 

SEQ ID NO: 151 318 ttgccgcaggatcgtcgaaacggccgttcgggaactcggcggcat

SEQ ID NO: 140 1196 cagcggtgattctgcctgctaccagtcacgaaccggcaccgtgcgtgatt

SEQ ID NO: 148 358 ttcttcct -cacc 

SEQ ID NO: 149 321 tgaaact ; • 

SEQ ID NO: 150 296

SEQ ID NO: 151 363 . 

SEQ ID NO: 152 195 tgcta t 



SEQ ID NO: 140 1246 gaggttgatgatgagcgggtgctgaattttctggccgatgaaatcaccgg 

SEQ ID NO: 148 370 caggggctgctgccgcattt 

SEQ ID NO: 149 328 atgaac acgaatttac — g 

SEQ ID NO: 150 296 tgag gatgaaa 

SEQ ID NO: 151 363 c

SEQ ID NO: 152 200 aagagggggctga

SEQ ID NO: 140 1296 gacaattgtgattgccagtcgcctggcccgttactggcagtcgcaacggc 

SEQ ID NO: 148 390 

SEQ ID NO: 149 345 tccagttgtcctcatcactagcctg 

SEQ ID NO: 150 307 ■ 

SEQ ID NO: 151 364 gaca 

SEQ ID NO: 152 213 

SEQ ID NO: 140 1346 ttacccccggcgcacgtgcgcgtgggccgcgtgtcatttttctctcgaac 

SEQ ID NO:148 390 --cggcgc c 

SEQ ID NO: 150 307 

SEQ ID NO: 151 368 ttctcgtcaac

SEQ ID NO: 152 213 tatctccattctat ac 

SEQ ID NO: 140 1396 ggtgccgatcaaaatgggaatgtttacggacgcattcaaagtgccgctat 

SEQ ID NO: 148 397 ggtgc- ' at 

SEQ ID NO: 149 370 gctat 

SEQ ID NO: 150 307 -gaagaagactgggatg 

SEQ ID NO: 151 379 aatgc 

SEQ ID NO: 152 229 ttagacgagca ttcggacgca • 

SEQ ID NO: 140 1446 cggtcagctcattcgtgtgtggcgtcacgaggctgaacttgactatcagc

SEQ ID NO: 148 404 cgatca 

SEQ ID NO: 149 375 ccctcatttgatt gctacaaaagggag 

SEQ ID NO: 150 323 cggt aataaac 

SEQ ID NO: 151 384 

SEQ ID NO: 152 250 r ; gagg — aaac 

SEQ ID NO: 140 1496 gtgccagcgccgccggtgatcatgtgctgccgccggtatgggccaatcag 

SEQ ID NO: 148 410 " 

SEQ ID NO:149 402 

SEQ ID NO: 150 334 gtg aatc— - 

SEQ ID NO: 151 384 agcccatcag 

SEQ ID NO: 152 258 acgcaaacg — gate- gaaaaggag 

SEQ ID NO: 140 1546 attgtgcgcttcgctaaccgcagccttgaagggttagaatttgcctgtgc 

SEQ ID NO: 148 410 

SEQ ID NO: 149 402

SEQ ID NO: 150 341 tgaagggt 

SEQ ID NO: 151 394 ■ gcgaccttcaag 

SEQ ID NO: 152 280 aatgtccgctgc ctgcttatcc 






SEQ ID NO: 140 1596 ctggacagctcaattgctccatagtcaacgccatatcaatgagattaccc

SEQ ID NO: 148 410

SEQ ID NO: 149 402 catagt taacg tatccagtata

SEQ ID NO: 150 349 gttttcaacg

SEQ ID NO: 151 406

SEQ ID NO: 152 302 cggga

SEQ ID NO: 140 1646 tcaacatccctgccaacattagcgccaccaccggcgcacgcagtgcatcg

SEQ ID NO: 148 410 tcaacatctcttcctattt cgcccgca

SEQ ID NO: 151 406 aacatc gaagacatcagcgac

SEQ ID NO: 152 307

SEQ ID NO: 140 1696 gtcggatgggcggaaagcctgatcgggttgcatttggggaaagttgcctt

SEQ ID NO: 148 437

SEQ ID NO: 149 437

SEQ ID NO: 150 359

SEQ ID NO: 151 427 gagga

SEQ ID NO: 152 307 gatg ttgggga

SEQ ID NO: 140 1746 gattaccggtggcagcgccggtattggtgggcagatcgggcgcctcctgg

SEQ ID NO: 148 437

SEQ ID NO: 149 437

SEQ ID NO: 150 359

SEQ ID NO: 151 432 gtggg

SEQ ID NO: 152 318

SEQ ID NO: 140 1796 ctttgagtggcgcgcgcgtgatgctggcagcccgtgatcggcataagctc

SEQ ID NO: 148 437

SEQ ID NO: 149 437

SEQ ID NO: 150 359

SEQ ID NO: 151 437 agctgacattccg c

SEQ ID NO: 152 318

SEQ ID NO: 140 1846 gaacagatgcaggcgatgatccaatctgagctggctgaggtggggtatac

SEQ ID NO: 148 437 agatgatcc

SEQ ID NO: 149 440 gaatac

SEQ ID NO: 150 359 tgactcagatgg

SEQ ID NO: 151 451 gtcaacatgcacgccatgttc tac

SEQ ID NO: 152 318 cga-gaaccattgtgaacaagctg

SEQ ID NO: 140 1896 cgatgtcgaagatcgcgtccacattgcaccgggctgcgatgtgagtagcg

SEQ ID NO: 148 446 cg

SEQ ID NO: 149 446 c

SEQ ID NO: 150 371

SEQ ID NO: 151 475 c tgaccaag gcagcgg

SEQ ID NO: 140 1946 aagcgcagcttgcggatcttgttgaacgtaccctgtcagcttttggcacc

SEQ ID NO: 148 448 aagcg gccatc cagc

SEQ ID NO: 151 491 tgccgcacatgaagaa gggcagc

SEQ ID NO: 152 345 gcaaacagtggacc -attttggtaaa

SEQ ID NO: 140 1996 gtcgattatctga-tcaacaacgccgggatcgccggtgtcgaagagatgg

SEQ ID NO: 148 463 gtctactccctgt-ccaagggcgc 1 

SEQ ID NO: 149 447 1 

SEQ ID NO: 150 371

SEQ ID NO: 151 514 g cga-tcatcaacaccg — 

SEQ ID NO: 152 370 ctcgat-atcttagtgaacaacgccg 

SEQ ID NO: 140 2045 ttatcgatatgccagttgagggatggcgccataccctcttcgccaatctg 

SEQ ID NO: 148 486 gttga 

SEQ ID NO: 149 447 — ■ agggattatgteatacagt 

SEQ ID NO: 150 371 r — 

SEQ ID NO: 151 530 cttcca tcaatgccgacgttcccaatccg 

SEQ ID NO:152 395 ■ ctg 

SEQ ID NO: 140 2095 atcagcaactactcgttgatgcgcaaactggcgccgttgatgaaaaaaca 

SEQ ID NO: 148 491 actcgttga 

SEQ ID NO: 149 466 • 

SEQ ID NO: 150 371 tggtgccctacatgatcaaaca

SEQ ID NO: 151 559 ate ctactcgcctatgcg — accacca 

SEQ ID NO: 152 398 aacagcatc — ccca 

SEQ ID NO: 140 2145 gggtagcggttacatccttaacgtctcatcatactttggcggtgaaaaag 

SEQ ID NO: 149 466 ■ 

SEQ ID NO: 150 393 gaggaacggttcgatcgtgaacgtctcctctgtcgttgg— aat 

SEQ ID NO: 151 584 agggegeg ate cacaattt 

SEQ ID NO: 152 411 ggacag cattctcaataCttcaaca 

SEQ ID NO: 140 2195 atgcggccattccctaccccaaccgtgccgattacgccgtctcgaaggct 

SEQ ID NO: 148 500 ccagatcgct 

SEQ ID NO: 149 466 gtgtcaaaggct 

SEQ ID NO: 150 435 ataegggaat cctggtcagacgaattacgcggcgtcgaaggcg 

SEQ ID NO: 151 603 cagcgccg gtctcg — ; 

SEQ ID NO: 152 436 



SEQ ID NO: 140 2245 ggtcagcgggcaatggccgaagtctttgcgcgcttccttggcccg ga 

SEQ ID NO: 148 510 ggecttcgag ctcggcccgcgcgg 

SEQ ID NO: 149 478 g 

SEQ ID NO: 150 478 ggagtcataggaatgacc-aagacgt 

SEQ ID NO: 151 617 cgcagatgctggccgaa cgcg g- 

SEQ ID NO: 152 436 gaacagctggaa ■ aaaacctttcgc- 

SEQ ID NO: 140 2292 gatacagatcaatgecattgcgccgggtccggtcgaaggtgatcgcttgc 

SEQ ID NO: 148 534 catccgcgtcaacgccatcgcgcccggcacggtcga™ — — — 

SEQ ID NO: 149 479 ; 

SEQ ID NO: 150 503 — gggcgaaggaactcgct- — 

SEQ ID NO: 151 639 gataagagtgaatgtcgtggccccgggcccgatc 

SEQ ID NO: 152 460 -acaaatattttttccat 

SEQ ID NO: 140 2342 gcggtaccggtgaacgtcccggcctctttgcccgtcgggcgcggctgatt 

SEQ ID NO: 148 570 * 

SEQ ID NO:149 479 ' 

SEQ ID NO: 150 520 

SEQ ID NO: 151 673 — tggacg.ccgctg 

SEQ ID NO: 152 477

SEQ ID NO: 140 2392 ttggagaacaagcggctgaatgagcttcacgctgctcttatcgcggctgc

SEQ ID NO: 148 570 

SEQ ID NO: 149 479 

SEQ ID NO: 150 520 ggaagaaacatcagggtgaac gctgt 

SEQ ID NO: 151 685 atcccctccaccatgc

SEQ ID NO: 152 477 — gtttca 

SEQ ID NO: 140 2442 gcgcaccgatgagcgatctatgcacgaactggttgaactgctcttaccca

SEQ ID NO: 148 570 

SEQ ID NO: 149 479 ctatg gatcacttcacaaaat 

SEQ ID NO: 150 546 g-gcacc cgga 

SEQ ID NO: 151 701 ccgagga 

SEQ ID NO: 152 483 ■ tatg-acgaa 

SEQ ID NO: 140 2492 atgatgtggccgcactagagcagaatcccgcagcacctaccgcgttgcgt 

SEQ ID NO: 148 570 cacc 

SEQ ID NO: 149 500 tggcagcgttggagctg gctccttctggcgtgcga 

SEQ ID NO: 150 556 ttcat ■ agaaacccccatgac 

SEQ ID NO: 151 708 taccg 

SEQ ID NO: 152 492 : gaaagctttgcct 

SEQ ID NO: 140 2542 gaactggcacgacgttttcgcagcgaaggcgatccggcggcatcatcaag 

SEQ ID NO: 148 574 . gccatgcggcg caag 

SEQ ID NO: 149 535 g 

SEQ ID NO: 150 576 cgaaaaacttccag — ' aaaaag 

SEQ ID NO: 151 713 tcgccgatttcg 

SEQ ID NO:152 505 cacctg " ; : caag 

SEQ ID NO: 140 2592 cagtgcgctgctgaaccgttcaattgccgctaaattgctggctcgtttgc 

SEQ ID NO: 148 589 : accgt 

SEQ ID NO: 149 536 tgaac tcagt 

SEQ ID NO: 150 596 c ccgtgaaacggcc 

SEQ ID NO: 151 725 gc

SEQ ID NO: 152 515 aggggtg tgccatta

SEQ ID NO: 140 2642 ataatggtggctatgtgttgcctgccgacatctttgcaaacctgccaaac 

SEQ ID NO:148 594 cgac aacctgcc 

SEQ ID NO: 149 546 caaccctg 

SEQ ID NO: 150 610 ctttccaga

SEQ ID NO: 151 727 aaacaggtgcctatg 

SEQ ID NO: 152 530 ttaat acgacat 

SEQ ID NO: 140 2692 ccgcccgatcccttcttcacccgagcccagattgatcgcgaggctcgcaa 

SEQ ID NO: 148 606 

SEQ ID NO:149 554 gaccagttct 

SEQ ID NO: 150 619 atacc gctgggaa 

SEQ ID NO: 151 742 ' a* 

SEQ ID NO: 152 542 cgattaccgctt 

SEQ ID NO:140 2742 ggttcgtgacggcatcatggggatgctctacctgcaacggatgccgactg 

SEQ ID NO: 148 606 * ggccga 

SEQ ID NO: 149 564 tac: 

SEQ ID NO: 150 632 ggtttgggaagccagaagagg

SEQ ID NO: 151 744 g 

SEQ ID NO: 152 554 atgaaggggat acgg

SEQ ID NO: 140 2792 agtttgatgtcgcaatggccaccgtctattaccttgccgaccgcaatgtc

SEQ ID NO: 148 612 ggcca aggccgaactgaaggcc

SEQ ID NO: 149 567 tgatatcgc ~ 

SEQ ID NO: 150 653 tggcgca 

SEQ ID NO: 151 745 cgaccg 

SEQ ID NO: 152 569 cgttaattgattattccagcacaaag — 

SEQ ID NO: 140 2842 agtggtgagaca-ttccacccatcaggtggtttgcgttacgaacgcaccc 

SEQ ID NO: 148 634 tatg tcgaacgcagc- 

SEQ ID NO: 149 576

SEQ ID NO: 150 660 ggttatactcttcctcgcatcggacgagtcgagttacg 

SEQ ID NO: 151 751 : 

SEQ ID NO: 152 595 ggtgcga ; ttgtttcctttacg 

SEQ ID NO: 140 2891 ctaccggtggcgaactcttcggcttgccctcaccggaacggctggcggag 

SEQ ID NO: 148 649 tatccgctgggccgcatcgg-ccgtccggacgac 

SEQ ID NO: 149 576 1 ; a 9 

SEQ ID NO: 150 698 tcaccggacagg

SEQ ID NO: 151 751 ggccagccc gtggaa 

SEQ ID NO: 152 616 cgttccatggcgaagtc gcttgc 

SEQ ID NO: 140 2941 ctggtcggaagcacggtctatctgataggtgaacatctgactgaacacct

SEQ ID NO: 148 682 ctcgccggcatggcggtttatct ■ 

SEQ ID NO: 149 578 'ctggt tctggct 

SEQ ID NO: 150 710 ^---tgatag 

SEQ ID NO: 151 766 ctcg cctcggcctatgtcat 

SEQ ID NO: 152 639 agataaa



SEQ ID NO: 140 2991 taacctgcttgcccgtgcgtacctcgaacgttacggggcacgtcaggtag

SEQ ID NO: 148 705 ; 

SEQ ID NO:149 590 7 

SEQ ID NO:150 716 

SEQ ID NO: 151 786



SEQ ID NO: 152 646 ggca- 



SEQ ID NO: 140 3041 tgatgattgttgagacagaaaccggggcagagacaatgcgtcgcttgctc 

SEQ ID NO: 148 705 

SEQ ID NO: 149 590 tt*ete 

SEQ ID NO: 150 716 7 ' 

SEQ ID NO: 151 786 

SEQ ID NO: 152 650 tcagagtgaatgcg 

SEQ ID NO: 140 3091 cacgatcacgtcg

SEQ ID NO: 148 705 

SEQ ID NO: 149 596 c tgatct 

SEQ ID NO: 150 716 

SEQ ID NO: 151 786 gctgg

SEQ ID NO: 152 664 gtggcgcccggt 

SEQ ID NO: 140 3141 gatcgaagccgctatcgaccaggctatcactcgctacggtcgcccagggc 

SEQ ID NO: 148 705 agccagcgacgaggc

SEQ ID NO: 149 603 gcttgaag- 



SEQ ID NO: 150 716

SEQ ID NO: 151 791 cggatccgatgtcga gctac

SEQ ID NO: 152 676 ccgatttggacaccgct

SEQ ID NO: 140 3191 cggtcgtctgtacccccttccggccactgccgacggtaccactggtcggg

SEQ ID NO: 148 720 : 

SEQ ID NO: 149 611

SEQ ID NO: 150 716 r ~ 

SEQ ID NO: 151 811 

SEQ ID NO: 152 693 tattccgg -cgacattccctgagg 

SEQ ID NO: 140 3241 cgtaaagacagtgactggagcacagtgttgagtgaggctgaatttgccga

SEQ ID NO: 148 720 ggcctgga cga 

SEQ ID NO: 149 611 -atacaggg 

SEQ ID NO: 150 716 -gaat 

SEQ ID NO: 151 811 gtgtcaggcgca ■ 

SEQ ID NO: 152 716 ; — aaaaagtga-aaeagcac < ggcttggatacecca 

SEQ ID NO: 140 3291 gttgtgcgaacaccagctcacccaccatttccgggtagcgcgcaagattg 

SEQ ID NO: 148 731 gcggtgggatc tttg 

SEQ ID NO: 149 619 gctcatacaccgt : — 

SEQ ID NO: 150 720

SEQ ID NO: 151 823 acgattg

SEQ ID NO: 152 748 atgggaagaccgggacagcc ■ ggttgagc 

SEQ ID NO: 140 3341 ccctgagtgatggtgc-cagtctcgcgctggtcactcccgaaactacggc

SEQ ID NO: 148 746 ccgtg gatggt

SEQ ID NO: 149 632 — -tggggaaagctgcgcagtct- — — — - 

SEQ ID NO: 150 720 agatgg 

SEQ ID NO: 151 830 ccgtga ~f 

SEQ ID NO: 152 776 atgcaggcgc-ctatgtCctgctggcgtctgacgaa — 

SEQ ID NO: 140 3390 tacctcaactaccgagcaatttgctctggctaacttcatcaaaacgaccc 

SEQ ID NO: 148 757 • < 

SEQ ID NO: 149 652 gaggagattgct 

SEQ ID NO: 150 726 

SEQ ID NO: 151 836 — r 

SEQ ID NO: 152 811 tcttccta 

SEQ ID NO: 140 3440 ttcacgcttttacggctacgattggtgtcgagagcgaaagaactgctcag 

SEQ ID NO: 148 757 ggcta 

SEQ ID NO: 149 664 « gatatgatt 

SEQ ID NO: 150 726 

SEQ ID NO: 151 836 

SEQ ID NO: 152 819 tatga -cag 

SEQ ID NO: 140 3490 cgcattctgatcaatcaagtcgatctgacccggcgtgcgcgtgccgaaga 

SEQ ID NO: 148 762 : 

SEQ ID NO: 149 673 gtgtatctg gctagtgataaagc

SEQ ID NO: 150 726 gg 

SEQ ID NO: 151 836 ~ 

SEQ ID NO: 152 827 ggca gaccattcatgt ; gaatg 

SEQ ID NO: 140 3540 gccgcgtgatccgcacgagcgtcaacaagaactggaacgttttatcgagg 

SEQ ID NO: 148 762 

SEQ ID NO: 149 696 taagagtgtt acggggtcctgttat 

SEQ ID NO: 150 728 gcctcgtgat 

SEQ ID NO: 151 836 : 

SEQ ID NO: 152 848 gcggc cgttttat

SEQ ID NO: 140 3590 cagtcttgctggtcactgcaccactcccgcctgaagccgatacccgttac



SEQ ID NO: 148 762 

SEQ ID NO: 149 721 atcatggacaatg gactcgcgc 

SEQ ID NO: 150 738 -ctga 

SEQ ID NO: 151 836 ccggcggcaagcc— 

SEQ ID NO: 152 861

SEQ ID NO: 140 3640 gccgggcggattcatcgcggacgggcgattaccgtgtaa 

SEQ ID NO: 148 762 cacggccggatga 

SEQ ID NO: 149 743 tgca gtaa

SEQ ID NO: 150 742 

SEQ ID NO: 151 849 — -tttcctttga- 

SEQ ID NO: 152 861 ttcaac gtaa

Figure 56

1 MVGKKVVHHL MMSAKDAHYT GNLVNGARIV NQWGDVGTEL
41 MVYVDGDISL FLGYKDIEFT APVYVGDFME YHGWIEKVGN
81 QSYTCKFEAW KVATMVDITN PQDTRATACE PPVLCGRATG
121 SLFIAKKDQR GPQESSFKER KHPGE (SEQ ID NO: 160)

Figure 57

1 MVGKKVVHHL MMSAKDAHYT GNLVNGARIV NQWGDVGTEL
41 MVYVDGDISL FLGYKDIEFT APVYVGDFME YHGWIEKVGN
81 QSYTCKFEAW KVAKMVDITN PQDTRATACE PPVLCGTATG
121 SLFIAKDNQR GPQESSFKDA KHPQ (SEQ ID NO: 161)

Figure 58

1 ATGGTAGGTA AAAAGGTTGT ACATCATTTA ATGATGAGCG

41 CAAAAGATGC TCACTATACT GGAAACTTAG TAAACGGCGC

81 TAGAATTGTG AATCAGTGGG GCGACGTTGG TACAGAATTA

121 ATGGTTTATG TTGATGGTGA CATAAGCTTA TTCTTGGGCT

161 ACAAAGATAT CGAATTCACA GCTCCTGTAT ATGTTGGTGA

201 CTTTATGGAA TACCAGGGCT GGATTGAAAA AGTTGGTAAC

241 CAGTCCTATA CATGTAAATT TGAAGCATGG AAAGTTGCAA

281 CAATGGTTGA TATCACAAAT CCTCAGGATA CACGCGCAAC

321 AGCTTGTGAG CCTCCGGTAT TGTGCGGAAG AGCAACGGGT

361 AGTTTGTTCA TCGCAAAAAA AGATCAGAGA GGCCCTCAGG

401 AATCCTCTTT TAAAGAGAGA AAGCACCCCG GTGAATGA
(SEQ ID NO: 162)

Figure 59

1 ATGGTAGGTA AAAAGGTTGT ACATCATTTA ATGATGAGCG 

41 CAAAAGATGC TCACTATACT GGAAACTTAG TAAACGGCGC 

81 TAGAATTGTG AATCAGTGGG GCGACGTAGG TACAGAATTA 

121 ATGGTTTATG TTGATGGTGA CATCAGCTTA TTCTTGGGCT 

161 ACAAAGATAT CGAATTCACA GCTCCTGTAT ATGTTGGTGA 

201 TTTTATGGAA TACCAGGGCT GGATTGAAAA AGTTGGCAAC 

241 CAGTCCTATA CATGTAAATT TGAAGCATGG AAAGTAGCAA 

281 AGATGGTTGA TATCACAAAT CCACAGGATA CACGTGCAAC 

321 AGCTTGTGAA CCTCCGGTAC TTTGTGGTAC TGCAACAGGC 

361 AGCCTTTTCA TCGCAAAGGA TAATCAGAGA GGTCCTCAGG 

401 AATCTTCCTT GAAGGATGCA AAGCACCCTC AATAA

(SEQ ID NO: 163)

3-HYDROXYPROPIONIC ACID AND OTHER ORGANIC COMPOUNDS

FIELD OF THE INVENTION 

The invention relates to enzymes and methods that can be used to produce organic
acids and related products.

CROSS REFERENCE TO RELATED APPLICATIONS 

This application claims priority from the following U.S. Provisional Patent
Applications, which are herein incorporated by reference: U.S. Provisional Patent
Application Serial Number 60/252,123, filed November 20, 2000; U.S. Provisional Patent
Application Serial Number 60/285,478, filed April 20, 2001; U.S. Provisional Patent
Application Serial Number 60/306,727, filed July 20, 2001; and U.S. Provisional Patent
Application Serial Number 60/317,845, filed September 7, 2001.

BACKGROUND

Organic chemicals such as organic acids, esters, and polyols can be used to
synthesize plastic materials and other products. To meet the increasing demand for
organic chemicals, more efficient and cost-effective production methods are being
developed which utilize raw materials based on carbohydrates rather than hydrocarbons.
For example, certain bacteria have been used to produce large quantities of lactic acid
used in the production of polylactic acid.

3-hydroxypropionic acid (3-HP) is an organic acid. Although several chemical
synthesis routes have been described to produce 3-HP, only one biocatalytic route has
been previously disclosed (WO 01/16346 to Suthers, et al.). 3-HP has utility
for specialty synthesis and can be converted to commercially important intermediates by
known art in the chemical industry, e.g., acrylic acid by dehydration, malonic acid by
oxidation, esters by esterification reactions with alcohols, and 1,3-propanediol by
reduction.

SUMMARY 

The invention relates to methods and materials involved in producing
3-hydroxypropionic acid and other organic compounds (e.g., 1,3-propanediol, acrylic acid,
polymerized acrylate, esters of acrylate, polymerized 3-HP, esters of 3-HP, and malonic
acid and its esters). Specifically, the invention provides nucleic acid molecules,
polypeptides, host cells, and methods that can be used to produce 3-HP and other organic
compounds such as 1,3-propanediol, acrylic acid, polymerized acrylate, esters of acrylate,
polymerized 3-HP, esters of 3-HP, and malonic acid and its esters. 3-HP has potential to
be both biologically and commercially important. For example, the nutritional industry
can use 3-HP as a food, feed additive, or preservative, while the derivatives mentioned
above can be produced from 3-HP. The nucleic acid molecules described herein can be
used to engineer host cells with the ability to produce 3-HP as well as other organic
compounds such as 1,3-propanediol, acrylic acid, polymerized acrylate, esters of acrylate,
polymerized 3-HP, and esters of 3-HP. The polypeptides described herein can be used in
cell-free systems to make 3-HP as well as other organic compounds such as
1,3-propanediol, acrylic acid, polymerized acrylate, esters of acrylate, polymerized 3-HP, and
esters of 3-HP. The host cells described herein can be used in culture systems to produce
large quantities of 3-HP as well as other organic compounds such as 1,3-propanediol,
acrylic acid, polymerized acrylate, esters of acrylate, polymerized 3-HP, and esters of
3-HP.

One aspect of the invention provides cells that have lactyl-CoA dehydratase
activity and 3-hydroxypropionyl-CoA dehydratase activity, and methods of making
products such as those described herein by culturing at least one of the cells that have
lactyl-CoA dehydratase activity and 3-hydroxypropionyl-CoA dehydratase activity. In
some embodiments, the cell can also contain an exogenous nucleic acid molecule that
encodes one or more of the following polypeptides: a polypeptide having E1 activator
activity; an E2 α polypeptide that is a subunit of an enzyme having lactyl-CoA
dehydratase activity; an E2 β polypeptide that is a subunit of an enzyme having
lactyl-CoA dehydratase activity; and a polypeptide having 3-hydroxypropionyl-CoA
dehydratase activity. Additionally, the cell can have CoA transferase activity, CoA
synthetase activity, poly hydroxyacid synthase activity, 3-hydroxypropionyl-CoA
hydrolase activity, 3-hydroxyisobutyryl-CoA hydrolase activity, and/or lipase activity.
Moreover, the cell can contain at least one exogenous nucleic acid molecule that
expresses one or more polypeptides that have CoA transferase activity,
3-hydroxypropionyl-CoA hydrolase activity, 3-hydroxyisobutyryl-CoA hydrolase activity,
CoA synthetase activity, poly hydroxyacid synthase activity, and/or lipase activity.

In another embodiment of the invention, the cell that has lactyl-CoA dehydratase
activity and 3-hydroxypropionyl-CoA dehydratase activity produces a product, for
example, 3-HP, polymerized 3-HP, and/or an ester of 3-HP, such as methyl
hydroxypropionate, ethyl hydroxypropionate, propyl hydroxypropionate, and/or butyl
hydroxypropionate. Accordingly, the invention also provides methods of producing one
or more of these products. These methods involve culturing the cell that has lactyl-CoA
dehydratase activity and 3-hydroxypropionyl-CoA dehydratase activity under conditions
that allow the product to be produced. These cells also can have CoA synthetase activity
and/or poly hydroxyacid synthase activity.

Another aspect of the invention provides cells that have CoA synthetase activity,
lactyl-CoA dehydratase activity, and poly hydroxyacid synthase activity. In some
embodiments, these cells also can contain an exogenous nucleic acid molecule that
encodes one or more of the following polypeptides: a polypeptide having E1 activator
activity; an E2 α polypeptide that is a subunit of an enzyme having lactyl-CoA
dehydratase activity; an E2 β polypeptide that is a subunit of an enzyme having
lactyl-CoA dehydratase activity; a polypeptide having CoA synthetase activity; and a
polypeptide having poly hydroxyacid synthase activity.

In another embodiment of the invention, the cell that has CoA synthetase activity, 
lactyl-CoA dehydratase activity, and poly hydroxyacid synthase activity can produce a 
product, for example, polymerized acrylate. 

Another aspect of the invention provides a cell comprising CoA transferase
activity, lactyl-CoA dehydratase activity, and lipase activity. In some embodiments, the
cell also can contain an exogenous nucleic acid molecule that encodes one or more of the
following polypeptides: a polypeptide having CoA transferase activity; a polypeptide
having E1 activator activity; an E2 α polypeptide that is a subunit of an enzyme having
lactyl-CoA dehydratase activity; an E2 β polypeptide that is a subunit of an enzyme
having lactyl-CoA dehydratase activity; and a polypeptide having lipase activity. This
cell can be used, among other things, to produce products such as esters of acrylate (e.g.,
methyl acrylate, ethyl acrylate, propyl acrylate, and butyl acrylate).

In some embodiments, 1,3-propanediol can be created from either 3-HP-CoA or
3-HP via the use of polypeptides having enzymatic activity. These polypeptides can be
used either in vitro or in vivo. When converting 3-HP-CoA to 1,3-propanediol,
polypeptides having oxidoreductase activity or reductase activity (e.g., enzymes from the
1.1.1.- class of enzymes) can be used. Alternatively, when creating 1,3-propanediol from
3-HP, a combination of (1) a polypeptide having aldehyde dehydrogenase activity (e.g.,
an enzyme from the 1.1.1.34 class) and (2) a polypeptide having alcohol dehydrogenase
activity (e.g., an enzyme from the 1.1.1.32 class) can be used.

In some embodiments of the invention, products are produced in vitro (outside of
a cell). In other embodiments of the invention, products are produced using a
combination of in vitro and in vivo (within a cell) methods. In yet other embodiments of
the invention, products are produced in vivo. For methods involving in vivo steps, the
cells can be isolated cultured cells or whole organisms such as transgenic plants,
non-human mammals, or single-celled organisms such as yeast and bacteria (e.g.,
Lactobacillus, Lactococcus, Bacillus, and Escherichia cells). Hereinafter such cells are
referred to as production cells. Products produced by these production cells can be
organic products such as 3-HP and/or the nucleic acid molecules and polypeptides
described herein.

Another aspect of the invention provides polypeptides having an amino acid
sequence that (1) is set forth in SEQ ID NO:2, 10, 18, 26, 35, 37, 39, 41, 141, 160, or 161,
(2) is at least 10 contiguous amino acid residues of a sequence set forth in SEQ ID NO:2,
10, 18, 26, 35, 37, 39, 41, 141, 160, or 161, (3) has at least 65 percent sequence identity
with at least 10 contiguous amino acid residues of a sequence set forth in SEQ ID NO:2,
10, 18, 26, 35, 37, 39, 41, 141, 160, or 161, (4) is a sequence set forth in SEQ ID NO:2,
10, 18, 26, 35, 37, 39, 41, 141, 160, or 161 having conservative amino acid substitutions,
or (5) has at least 65 percent sequence identity with a sequence set forth in SEQ ID NO:2,
10, 18, 26, 35, 37, 39, 41, 141, 160, or 161. Accordingly, the invention also provides
nucleic acid sequences that encode any of the polypeptides described herein as well as
specific binding agents that bind to any of the polypeptides described herein. Likewise,
the invention provides transformed cells that contain any of the nucleic acid sequences
that encode any of the polypeptides described herein. These cells can be used to produce
nucleic acid molecules, polypeptides, and organic compounds. The polypeptides can be
used to catalyze the formation of organic compounds or can be used as antigens to create
specific binding agents.
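
The identity thresholds in items (3) and (5) above can be illustrated with a small sketch. The code below is a simplified example only: it assumes an ungapped, position-by-position comparison of equal-length stretches, which is not necessarily the comparison procedure relied on in this specification. The function names and the toy residue strings are hypothetical and are not taken from any SEQ ID NO.

```python
def percent_identity(a: str, b: str) -> float:
    """Ungapped percent identity between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def has_window_meeting_threshold(candidate: str, reference: str,
                                 window: int = 10,
                                 threshold: float = 65.0) -> bool:
    """Return True if any `window`-residue stretch of `candidate` reaches
    `threshold` percent identity with any same-length stretch of `reference`.

    Assumption: stretches are compared without gaps; real alignment tools
    may score the same pair differently.
    """
    for i in range(len(candidate) - window + 1):
        for j in range(len(reference) - window + 1):
            identity = percent_identity(candidate[i:i + window],
                                        reference[j:j + window])
            if identity >= threshold:
                return True
    return False


# Toy strings (hypothetical): 7 of 10 positions match.
print(percent_identity("MVGKKVVHHL", "MVGKKVVAAA"))
```

A brute-force scan over all window pairs is used here for clarity; it is adequate for short sequences but would be slow for long ones.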

In yet another embodiment, the invention provides isolated nucleic acid molecules
that contain at least one of the following nucleic acid sequences: (1) a nucleic acid
sequence as set forth in SEQ ID NO:1, 9, 17, 25, 33, 34, 36, 38, 40, 42, 129, 140, 142,
162, or 163; (2) a nucleic acid sequence having at least 10 consecutive nucleotides from a
sequence set forth in SEQ ID NO:1, 9, 17, 25, 33, 34, 36, 38, 40, 42, 129, 140, 142, 162,
or 163; (3) a nucleic acid sequence that hybridizes under hybridization conditions (e.g.,
moderately or highly stringent hybridization conditions) to a sequence set forth in SEQ
ID NO:1, 9, 17, 25, 33, 34, 36, 38, 40, 42, 129, 140, 142, 162, or 163; (4) a nucleic acid
sequence having 65 percent sequence identity with at least 10 consecutive nucleotides
from a sequence set forth in SEQ ID NO:1, 9, 17, 25, 33, 34, 36, 38, 40, 42, 129, 140,
142, 162, or 163; and (5) a nucleic acid sequence having at least 65 percent sequence
identity with a sequence set forth in SEQ ID NO:1, 9, 17, 25, 33, 34, 36, 38, 40, 42, 129,
140, 142, 162, or 163. Accordingly, the invention also provides a production cell that
contains at least one exogenous nucleic acid having any of the nucleic acid sequences
provided above. The production cell can be used to express polypeptides that have an
enzymatic activity such as CoA transferase activity, lactyl-CoA dehydratase activity, CoA
synthase activity, dehydratase activity, dehydrogenase activity, malonyl CoA reductase
activity, β-alanine ammonia lyase activity, and/or 3-hydroxypropionyl-CoA dehydratase
activity. Accordingly, the invention also provides methods of producing polypeptides
encoded by the nucleic acid sequences described above.

The invention also provides several methods such as methods for making 3-HP
from lactate, phosphoenolpyruvate (PEP), or pyruvate. In some embodiments, methods
for making 3-HP from lactate, PEP, or pyruvate involve culturing a cell containing at
least one exogenous nucleic acid under conditions that allow the cell to produce 3-HP.
These methods can be practiced using the various types of production cells described
herein. In some embodiments, the production cells can have one or more of the following
activities: CoA transferase activity, 3-hydroxypropionyl-CoA hydrolase activity,
3-hydroxyisobutyryl-CoA hydrolase activity, dehydratase activity, and/or malonyl CoA
reductase activity.

In other embodiments, the methods involve making 3-HP wherein lactate is
contacted with a first polypeptide having CoA transferase activity or CoA synthetase
activity such that lactyl-CoA is formed, then contacting lactyl-CoA with a second
polypeptide having lactyl-CoA dehydratase activity to form acrylyl-CoA, then contacting
acrylyl-CoA with a third polypeptide having 3-hydroxypropionyl-CoA dehydratase
activity to form 3-hydroxypropionic acid-CoA, and then contacting 3-hydroxypropionic
acid-CoA with the first polypeptide to form 3-HP or with a fourth polypeptide having
3-hydroxypropionyl-CoA hydrolase activity or 3-hydroxyisobutyryl-CoA hydrolase activity
to form 3-HP.
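
The sequence of contacting steps in the preceding paragraph can be written out as an ordered table of conversions. The sketch below is only a bookkeeping aid for reading the pathway; it models no kinetics or biology. The names `LACTATE_TO_3HP` and `trace_pathway` are hypothetical, and the compound and activity labels are copied from the paragraph above.

```python
# Each step: (substrate, enzymatic activity named in the text, product).
LACTATE_TO_3HP = [
    ("lactate",
     "CoA transferase or CoA synthetase",
     "lactyl-CoA"),
    ("lactyl-CoA",
     "lactyl-CoA dehydratase",
     "acrylyl-CoA"),
    ("acrylyl-CoA",
     "3-hydroxypropionyl-CoA dehydratase",
     "3-hydroxypropionic acid-CoA"),
    ("3-hydroxypropionic acid-CoA",
     "CoA transferase, 3-hydroxypropionyl-CoA hydrolase, "
     "or 3-hydroxyisobutyryl-CoA hydrolase",
     "3-HP"),
]


def trace_pathway(steps, start):
    """Follow `steps` in order from `start` and return every compound visited.

    Raises ValueError if a step's substrate is not the product of the
    previous step, i.e. if the table does not describe a connected pathway.
    """
    compounds = [start]
    current = start
    for substrate, _activity, product in steps:
        if substrate != current:
            raise ValueError(
                f"step expects {substrate!r} but current compound is {current!r}"
            )
        compounds.append(product)
        current = product
    return compounds


print(" -> ".join(trace_pathway(LACTATE_TO_3HP, "lactate")))
```

Keeping the steps as plain tuples makes it easy to check that each product feeds the next step, which is the property the paragraph describes.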

Another aspect of the invention provides methods for making polymerized 3-HP.
These methods involve making 3-hydroxypropionic acid-CoA as described above, and
then contacting the 3-hydroxypropionic acid-CoA with a polypeptide having poly
hydroxyacid synthase activity to form polymerized 3-HP.

In yet another embodiment of the invention, methods for making an ester of 3-HP 
are provided. These methods involve making 3-HP as described above, and then 
additionally contacting 3-HP with a fifth polypeptide having lipase activity to form an 
ester. 

The invention also provides methods for making polymerized acrylate. These
methods involve culturing a cell that has CoA synthetase activity, lactyl-CoA
dehydratase activity, and poly hydroxyacid synthase activity such that polymerized
acrylate is made. Accordingly, the invention also provides methods of making
polymerized acrylate wherein lactate is contacted with a first polypeptide having CoA
synthetase activity to form lactyl-CoA, then contacting lactyl-CoA with a second
polypeptide having lactyl-CoA dehydratase activity to form acrylyl-CoA, and then
contacting acrylyl-CoA with a third polypeptide having poly hydroxyacid synthase
activity to form polymerized acrylate.

The invention also provides methods of making an ester of acrylate. These
methods involve culturing a cell that has CoA transferase activity, lipase activity, and
lactyl-CoA dehydratase activity under conditions that allow the cell to produce an ester.

In another embodiment, the invention provides methods for making an ester of
acrylate, wherein acrylyl-CoA is formed as described above, and then acrylyl-CoA is
contacted with a polypeptide having CoA transferase activity to form acrylate, and
acrylate is contacted with a polypeptide having lipase activity to form the ester.

The invention also provides methods for making 3-HP. These methods involve
culturing a cell containing at least one exogenous nucleic acid that encodes at least one
polypeptide such that 3-HP is produced from acetyl-CoA or malonyl-CoA.

Alternative embodiments provide methods of making 3-HP, wherein acetyl-CoA
is contacted with a first polypeptide having acetyl-CoA carboxylase activity to form
malonyl-CoA, and malonyl-CoA is contacted with a second polypeptide having
malonyl-CoA reductase activity to form 3-HP.

In other embodiments, malonyl-CoA can be contacted with a polypeptide having 
malonyl-CoA reductase activity so that 3-HP can be made. 

In another embodiment, the invention provides a method for making 3-HP
using a β-alanine intermediate. This method can be performed by contacting β-alanine
CoA with a first polypeptide having β-alanyl-CoA ammonia lyase activity (such as a
polypeptide having the amino acid sequence set forth in SEQ ID NO:160 or 161) to form
acrylyl-CoA, contacting acrylyl-CoA with a second polypeptide having 3-HP-CoA
dehydratase activity to form 3-HP-CoA, and contacting 3-HP-CoA with a third
polypeptide having glutamate dehydrogenase activity to make 3-HP.

Unless otherwise defined, all technical and scientific terms used herein have the
same meaning as commonly understood by one of ordinary skill in the art to which this
invention pertains. Although methods and materials similar or equivalent to those
described herein can be used in the practice or testing of the present invention, suitable
methods and materials are described below. All publications, patent applications, patents,
and other references mentioned herein are incorporated by reference in their entirety. In
case of conflict, the present specification, including definitions, will control. In addition,
the materials, methods, and examples are illustrative only and not intended to be limiting.

Other features and advantages of the invention will be apparent from the 
following detailed description, and from the claims. 


DESCRIPTION OF DRAWINGS 
Figure 1 is a diagram of a pathway for making 3-HP.

Figure 2 is a diagram of a pathway for making polymerized 3-HP.

Figure 3 is a diagram of a pathway for making esters of 3-HP.

Figure 4 is a diagram of a pathway for making polymerized acrylic acid.

Figure 5 is a diagram of a pathway for making esters of acrylate.

Figure 6 is a listing of a nucleic acid sequence that encodes a polypeptide having
CoA transferase activity (SEQ ID NO:1).

Figure 7 is a listing of an amino acid sequence of a polypeptide having CoA
transferase activity (SEQ ID NO:2).

Figure 8 is an alignment of the nucleic acid sequences set forth in SEQ ID NOs:1,
3, 4, and 5.

Figure 9 is an alignment of the amino acid sequences set forth in SEQ ID NOs:2,
6, 7, and 8.

Figure 10 is a listing of a nucleic acid sequence that encodes a polypeptide having
E1 activator activity (SEQ ID NO:9).

Figure 11 is a listing of an amino acid sequence of a polypeptide having E1
activator activity (SEQ ID NO:10).

Figure 12 is an alignment of the nucleic acid sequences set forth in SEQ ID
NOs:9, 11, 12, and 13.

Figure 13 is an alignment of the amino acid sequences set forth in SEQ ID
NOs:10, 14, 15, and 16.

Figure 14 is a listing of a nucleic acid sequence that encodes an E2 α subunit of an
enzyme having lactyl-CoA dehydratase activity (SEQ ID NO:17).

Figure 15 is a listing of an amino acid sequence of an E2 α subunit of an enzyme
having lactyl-CoA dehydratase activity (SEQ ID NO:18).

Figure 16 is an alignment of the nucleic acid sequences set forth in SEQ ID
NOs:17, 19, 20, and 21.

Figure 17 is an alignment of the amino acid sequences set forth in SEQ ID
NOs:18, 22, 23, and 24.

Figure 18 is a listing of a nucleic acid sequence that encodes an E2 β subunit of an
enzyme having lactyl-CoA dehydratase activity (SEQ ID NO:25). The "G" at position
443 can be an "A"; and the "A" at position 571 can be a "G".

Figure 19 is a listing of an amino acid sequence of an E2 β subunit of an enzyme
having lactyl-CoA dehydratase activity (SEQ ID NO:26).

Figure 20 is an alignment of the nucleic acid sequences set forth in SEQ ID
NOs:25, 27, 28, and 29.

Figure 21 is an alignment of the amino acid sequences set forth in SEQ ID
NOs:26, 30, 31, and 32.

Figure 22 is a listing of a nucleic acid sequence of genomic DNA from
Megasphaera elsdenii (SEQ ID NO:33).

Figure 23 is a listing of a nucleic acid sequence that encodes a polypeptide from
Megasphaera elsdenii (SEQ ID NO:34).

Figure 24 is a listing of an amino acid sequence of a polypeptide from
Megasphaera elsdenii (SEQ ID NO:35).

Figure 25 is a listing of a nucleic acid sequence that encodes a polypeptide having
enzymatic activity (SEQ ID NO:36).

Figure 26 is a listing of an amino acid sequence of a polypeptide having
enzymatic activity (SEQ ID NO:37).

Figure 27 is a listing of a nucleic acid sequence that contains non-coding as well
as coding sequence of a polypeptide having CoA synthase, dehydratase, and
dehydrogenase activity (SEQ ID NO:38). The start site for the coding sequence is at
position 480, a ribosome binding site is at position 466-473, and the stop codon is at
position 5946.

Figure 28 is a listing of an amino acid sequence from a polypeptide having CoA
synthase, dehydratase, and dehydrogenase activity (SEQ ID NO:39).

Figure 29 is a listing of a nucleic acid sequence that encodes a polypeptide having
3-hydroxypropionyl-CoA dehydratase activity (SEQ ID NO:40).

Figure 30 is a listing of an amino acid sequence of a polypeptide having
3-hydroxypropionyl-CoA dehydratase activity (SEQ ID NO:41).

Figure 31 is a listing of a nucleic acid sequence that contains non-coding as well
as coding sequence of a polypeptide having 3-hydroxypropionyl-CoA dehydratase
activity (SEQ ID NO:42).

Figure 32 is an alignment of the nucleic acid sequences set forth in SEQ ID
NOs:40, 43, 44, and 45.

Figure 33 is an alignment of the amino acid sequences set forth in SEQ ID
NOs:41, 46, 47, and 48.

Figure 34 is a diagram of the construction of a synthetic operon (pTDH) that
encodes for polypeptides having CoA transferase activity, lactyl-CoA dehydratase
activity (E1, E2 α, and E2 β), and 3-hydroxypropionyl-CoA dehydratase activity
(3-HP-CoA dehydratase).

Figure 35A and B is a diagram of the construction of a synthetic operon (pHTD)
that encodes for polypeptides having CoA transferase activity, lactyl-CoA dehydratase
activity (E1, E2 α, and E2 β), and 3-hydroxypropionyl-CoA dehydratase activity
(3-HP-CoA dehydratase).

Figure 36A and B is a diagram of the construction of a synthetic operon
(pEIITHrEI) that encodes for polypeptides having CoA transferase activity, lactyl-CoA
dehydratase activity (E1, E2 α, and E2 β), and 3-hydroxypropionyl-CoA dehydratase
activity (3-HP-CoA dehydratase).

Figure 37A and B is a diagram of the construction of a synthetic operon
(pEIITHEI) that encodes for polypeptides having CoA transferase activity, lactyl-CoA
dehydratase activity (E1, E2 α, and E2 β), and 3-hydroxypropionyl-CoA dehydratase
activity (3-HP-CoA dehydratase).

Figure 38A and B is a diagram of the construction of two plasmids, pEIITH and
pPROEI. The pEIITH plasmid encodes polypeptides having CoA transferase activity,
lactyl-CoA dehydratase activity (E2 α and E2 β), and 3-hydroxypropionyl-CoA
dehydratase activity (3-HP-CoA dehydratase), and the pPROEI plasmid encodes a
polypeptide having E1 activator activity.

Figure 39 is a listing of a nucleic acid sequence that encodes a polypeptide having
CoA synthase, dehydratase, and dehydrogenase activity (SEQ ID NO:129).

Figure 40 is an alignment of the amino acid sequences set forth in SEQ ID
NOs:39, 130, and 131. The uppercase amino acid residues represent positions where that
amino acid residue is present in two or more sequences.

Figure 41 is an alignment of the amino acid sequences set forth in SEQ ID
NOs:39, 132, and 133. The uppercase amino acid residues represent positions where that
amino acid residue is present in two or more sequences.

Figure 42 is an alignment of the amino acid sequences set forth in SEQ ID
NOs:39, 134, and 135. The uppercase amino acid residues represent positions where that
amino acid residue is present in two or more sequences.

Figure 43 is a diagram of several pathways for making organic compounds using
the multifunctional OS17 enzyme.

Figure 44 is a diagram of a pathway for making 3-HP via acetyl-CoA and
malonyl-CoA.

Figure 45 is a diagram of pMSD8, pET30a/accl, pFN476, and PET286 constructs.

Figure 46 contains a total ion chromatogram and five mass spectra of
Coenzyme A thioesters. Panel A is a total ion chromatogram illustrating the separation of
Coenzyme A and four CoA-organic thioesters: 1=Coenzyme A, 2=lactyl-CoA,
3=acetyl-CoA, 4=acrylyl-CoA, 5=propionyl-CoA. Panel B is a mass spectrum of Coenzyme A.
Panel C is a mass spectrum of lactyl-CoA. Panel D is a mass spectrum of acetyl-CoA.
Panel E is a mass spectrum of acrylyl-CoA. Panel F is a mass spectrum of
propionyl-CoA.

Figure 47 contains ion chromatograms and mass spectra. Panel A is a total ion
chromatogram of a mixture of lactyl-CoA and 3-HP-CoA. The Panel A insert is the mass
spectrum recorded under peak 1. Panel B is a total ion chromatogram of lactyl-CoA. The
Panel B insert is the mass spectrum recorded under peak 2. In each panel, peak 1 is
3-HP-CoA, and peak 2 is lactyl-CoA. The peak labeled with an asterisk was confirmed not
to be a CoA ester.

Figure 48 contains ion chromatograms and mass spectra. Panel A is a total ion
chromatogram of CoA esters derived from a broth produced by E. coli transfected with
pEIITHrEI. The Panel A insert is the mass spectrum recorded under peak 1. Panel B is a
total ion chromatogram of CoA esters derived from a broth produced by control E. coli
not transfected with pEIITHrEI. The Panel B insert is the mass spectrum recorded under
peak 2. In each panel, peak 1 is 3-HP-CoA, and peak 2 is lactyl-CoA. The peaks labeled
with an asterisk were confirmed not to be CoA esters.

Figure 49 is a listing of a nucleic acid sequence that encodes a polypeptide having
malonyl-CoA reductase activity (SEQ ID NO:140).

Figure 50 is a listing of an amino acid sequence of a polypeptide having
malonyl-CoA reductase activity (SEQ ID NO:141).

Figure 51 is a listing of a nucleic acid sequence that encodes a portion of a
polypeptide having malonyl-CoA reductase activity (SEQ ID NO:142).

Figure 52 is an alignment of the amino acid sequences set forth in SEQ ID NOs:
141, 143, 144, 145, 146, and 147.

Figure 53 is an alignment of the nucleic acid sequences set forth in SEQ ID NOs:
140, 148, 149, 150, 151, and 152.

Figure 54 is a diagram of a pathway for making 3-HP via a β-alanine intermediate.

Figure 55 is a diagram of a pathway for making 3-HP via a β-alanine intermediate.

Figure 56 is a listing of an amino acid sequence of a polypeptide having
β-alanyl-CoA ammonia lyase activity (SEQ ID NO:160).

Figure 57 is a listing of an amino acid sequence of a polypeptide having
β-alanyl-CoA ammonia lyase activity (SEQ ID NO:161).

Figure 58 is a listing of a nucleic acid sequence that encodes a polypeptide having
β-alanyl-CoA ammonia lyase activity (SEQ ID NO:162).

Figure 59 is a listing of a nucleic acid sequence that can encode a polypeptide
having β-alanyl-CoA ammonia lyase activity (SEQ ID NO:163).

DETAILED DESCRIPTION

I. Terms 

Nucleic acid: The term "nucleic acid" as used herein encompasses both RNA and
DNA including, without limitation, cDNA, genomic DNA, and synthetic (e.g., chemically
synthesized) DNA. The nucleic acid can be double-stranded or single-stranded. Where
single-stranded, the nucleic acid can be the sense strand or the antisense strand. In
addition, nucleic acid can be circular or linear.

Isolated: The term "isolated" as used herein with reference to nucleic acid refers
to a naturally-occurring nucleic acid that is not immediately contiguous with both of the
sequences with which it is immediately contiguous (one on the 5' end and one on the 3'
end) in the naturally-occurring genome of the organism from which it is derived. For
example, an isolated nucleic acid can be, without limitation, a recombinant DNA
molecule of any length, provided one of the nucleic acid sequences normally found
immediately flanking that recombinant DNA molecule in a naturally-occurring genome is
removed or absent. Thus, an isolated nucleic acid includes, without limitation, a
recombinant DNA that exists as a separate molecule (e.g., a cDNA or a genomic DNA
fragment produced by PCR or restriction endonuclease treatment) independent of other
sequences as well as recombinant DNA that is incorporated into a vector, an
autonomously replicating plasmid, a virus (e.g., a retrovirus, adenovirus, or herpes virus),
or into the genomic DNA of a prokaryote or eukaryote. In addition, an isolated nucleic
acid can include a recombinant DNA molecule that is part of a hybrid or fusion nucleic
acid sequence.

The term "isolated" as used herein with reference to nucleic acid also includes any
non-naturally-occurring nucleic acid since non-naturally-occurring nucleic acid sequences
are not found in nature and do not have immediately contiguous sequences in a
naturally-occurring genome. For example, non-naturally-occurring nucleic acid such as an
engineered nucleic acid is considered to be isolated nucleic acid. Engineered nucleic acid
can be made using common molecular cloning or chemical nucleic acid synthesis
techniques. Isolated non-naturally-occurring nucleic acid can be independent of other
sequences, or incorporated into a vector, an autonomously replicating plasmid, a virus
(e.g., a retrovirus, adenovirus, or herpes virus), or the genomic DNA of a prokaryote or
eukaryote. In addition, a non-naturally-occurring nucleic acid can include a nucleic acid
molecule that is part of a hybrid or fusion nucleic acid sequence.

It will be apparent to those of skill in the art that a nucleic acid existing among
hundreds to millions of other nucleic acid molecules within, for example, cDNA or
genomic libraries, or gel slices containing a genomic DNA restriction digest is not to be
considered an isolated nucleic acid.

Exogenous: The term "exogenous" as used herein with reference to nucleic acid
and a particular cell refers to any nucleic acid that does not originate from that particular
cell as found in nature. Thus, non-naturally-occurring nucleic acid is considered to be
exogenous to a cell once introduced into the cell. Nucleic acid that is naturally-occurring
also can be exogenous to a particular cell. For example, an entire chromosome isolated
from a cell of person X is an exogenous nucleic acid with respect to a cell of person Y
once that chromosome is introduced into Y's cell.

Hybridization: The term "hybridization" as used herein refers to a method of
testing for complementarity in the nucleotide sequence of two nucleic acid molecules,
based on the ability of complementary single-stranded DNA and/or RNA to form a
duplex molecule. Nucleic acid hybridization techniques can be used to obtain an isolated
nucleic acid within the scope of the invention. Briefly, any nucleic acid having some
homology to a sequence set forth in SEQ ID NO:1, 9, 17, 25, 33, 34, 36, 38, 40, 42, 129,
140, 142, 162, or 163 can be used as a probe to identify a similar nucleic acid by
hybridization under conditions of moderate to high stringency. Once identified, the
nucleic acid then can be purified, sequenced, and analyzed to determine whether it is
within the scope of the invention as described herein.

Hybridization can be done by Southern or Northern analysis to identify a DNA or
RNA sequence, respectively, that hybridizes to a probe. The probe can be labeled with
biotin, digoxigenin, an enzyme, or a radioisotope such as 32P. The DNA or RNA to be
analyzed can be electrophoretically separated on an agarose or polyacrylamide gel,
transferred to nitrocellulose, nylon, or other suitable membrane, and hybridized with the
probe using standard techniques well known in the art such as those described in sections
7.39-7.52 of Sambrook et al., (1989) Molecular Cloning, second edition, Cold Spring
Harbor Laboratory, Plainview, NY. Typically, a probe is at least about 20 nucleotides in
length. For example, a probe corresponding to a 20 nucleotide sequence set forth in SEQ
ID NO:1, 9, 17, 25, 33, 34, 36, 38, 40, 42, 129, 140, or 142 can be used to identify an
identical or similar nucleic acid. In addition, probes longer or shorter than 20 nucleotides
can be used.

The invention also provides isolated nucleic acid sequences that are at least about
12 bases in length (e.g., at least about 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 40, 50, 60,
100, 250, 500, 750, 1000, 1500, 2000, 3000, 4000, or 5000 bases in length) and hybridize,
under hybridization conditions, to the sense or antisense strand of a nucleic acid having
the sequence set forth in SEQ ID NO:1, 9, 17, 25, 33, 34, 36, 38, 40, 42, 129, 140, 142,
162, or 163. The hybridization conditions can be moderately or highly stringent
hybridization conditions.
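
Because hybridization can be to either the sense or the antisense strand, it may help to see how one strand is derived from the other. The following is a minimal sketch, assuming an unambiguous DNA sequence written with the letters A, C, G, and T only; the function name is hypothetical, and the 20-nucleotide example string is used purely as an illustration of probe length.

```python
# Translation table mapping each base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the antisense strand (5' to 3') for a sense-strand DNA string."""
    seq = seq.upper()
    unexpected = set(seq) - set("ACGT")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    # Complement every base, then reverse so the result reads 5' to 3'.
    return seq.translate(_COMPLEMENT)[::-1]


probe = "ATGGTAGGTAAAAAGGTTGT"  # illustrative 20-nucleotide string
print(reverse_complement(probe))
```

Applying the function twice returns the original string, which is a quick sanity check on any probe sequence prepared this way.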

For the purpose of this invention, moderately stringent hybridization conditions
mean the hybridization is performed at about 42°C in a hybridization solution containing
25 mM KPO4 (pH 7.4), 5X SSC, 5X Denhardt's solution, 50 μg/mL denatured, sonicated
salmon sperm DNA, 50% formamide, 10% dextran sulfate, and 1-15 ng/mL probe (about
5x10^7 cpm/μg), while the washes are performed at about 50°C with a wash solution
containing 2X SSC and 0.1% sodium dodecyl sulfate.

Highly stringent hybridization conditions mean the hybridization is performed at
about 42°C in a hybridization solution containing 25 mM KPO4 (pH 7.4), 5X SSC, 5X
Denhardt's solution, 50 µg/mL denatured, sonicated salmon sperm DNA, 50% formamide,
10% Dextran sulfate, and 1-15 ng/mL probe (about 5x10^7 cpm/µg), while the washes are
performed at about 65°C with a wash solution containing 0.2X SSC and 0.1% sodium
dodecyl sulfate.
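
The two definitions share the same hybridization step and differ only in the wash.
The short Python sketch below records the stated temperatures and wash composition
in a small data structure and reports which parameters differ. The class and field
names are illustrative assumptions, and only the values given in the two
definitions above are encoded.

```python
# Sketch: the two stringency definitions as data, to show where they differ.
# Class and field names are hypothetical; values are taken from the definitions.
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class HybridizationConditions:
    hybridization_temp_c: float  # about 42 degrees C in both definitions
    wash_temp_c: float           # wash temperature, degrees C
    wash_ssc_x: float            # SSC concentration of the wash (X)
    wash_sds_percent: float      # sodium dodecyl sulfate in the wash (%)


MODERATE = HybridizationConditions(42.0, 50.0, 2.0, 0.1)
HIGH = HybridizationConditions(42.0, 65.0, 0.2, 0.1)


def differing_fields(a, b):
    """Return {field name: (value in a, value in b)} for fields that differ."""
    return {
        f.name: (getattr(a, f.name), getattr(b, f.name))
        for f in fields(a)
        if getattr(a, f.name) != getattr(b, f.name)
    }


if __name__ == "__main__":
    print(differing_fields(MODERATE, HIGH))
```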

Purified: The term "purified" as used herein does not require absolute purity;
rather, it is intended as a relative term. Thus, for example, a purified polypeptide or
nucleic acid preparation can be one in which the subject polypeptide or nucleic acid,
respectively, is at a higher concentration than the polypeptide or nucleic acid would be in
its natural environment within an organism. For example, a polypeptide preparation can
be considered purified if the polypeptide content in the preparation represents at least
50%, 60%, 70%, 80%, 85%, 90%, 92%, 95%, 98%, or 99% of the total protein content of
the preparation.

Transformed: A "transformed" cell is a cell into which a nucleic acid molecule
has been introduced by, for example, molecular biology techniques. As used herein, the
term "transformation" encompasses all techniques by which a nucleic acid molecule
might be introduced into such a cell including, without limitation, transfection with a viral
vector, conjugation, transformation with a plasmid vector, and introduction of naked
DNA by electroporation, lipofection, and particle gun acceleration.

Recombinant: A "recombinant" nucleic acid is one having (1) a sequence that is
not naturally occurring in the organism in which it is expressed or (2) a sequence made by
an artificial combination of two otherwise-separated, shorter sequences. This artificial
combination is often accomplished by chemical synthesis or, more commonly, by the
artificial manipulation of isolated segments of nucleic acids, e.g., by genetic engineering
techniques. "Recombinant" is also used to describe nucleic acid molecules that have been
artificially manipulated, but contain the same regulatory sequences and coding regions
that are found in the organism from which the nucleic acid was isolated.

Specific binding agent: A "specific binding agent" is an agent that is capable of
specifically binding to any of the polypeptides described herein, and can include
polyclonal antibodies, monoclonal antibodies (including humanized monoclonal
antibodies), and fragments of monoclonal antibodies such as Fab, F(ab')2, and Fv
fragments as well as any other agent capable of specifically binding to an epitope of such
polypeptides.

Antibodies to the polypeptides provided herein (or fragments thereof) can be used 
to purify or identify such polypeptides. The amino acid and nucleic acid sequences 
provided herein allow for the production of specific antibody-based binding agents that 
recognize the polypeptides described herein. 

Monoclonal or polyclonal antibodies can be produced to the polypeptides,
portions of the polypeptides, or variants thereof. Optimally, antibodies raised against one
or more epitopes on a polypeptide antigen will specifically detect that polypeptide. That
is, antibodies raised against one particular polypeptide would recognize and bind that
particular polypeptide, and would not substantially recognize or bind to other
polypeptides. The determination that an antibody specifically binds to a particular
polypeptide is made by any one of a number of standard immunoassay methods; for
instance, Western blotting (see, e.g., Sambrook et al. (ed.), Molecular Cloning: A
Laboratory Manual, 2nd ed., vol. 1-3, Cold Spring Harbor Laboratory Press, Cold Spring
Harbor, N.Y., 1989).

To determine that a given antibody preparation (such as a preparation produced in
a mouse against a polypeptide having the amino acid sequence set forth in SEQ ID NO:
2) specifically detects the appropriate polypeptide (e.g., a polypeptide having the amino
acid sequence set forth in SEQ ID NO: 2) by Western blotting, total cellular protein can
be extracted from cells and separated by SDS-polyacrylamide gel electrophoresis. The
separated total cellular protein can then be transferred to a membrane (e.g.,
nitrocellulose), and the antibody preparation incubated with the membrane. After
washing the membrane to remove non-specifically bound antibodies, the presence of
specifically bound antibodies can be detected using an appropriate secondary antibody
(e.g., an anti-mouse antibody) conjugated to an enzyme such as alkaline phosphatase,
since application of 5-bromo-4-chloro-3-indolyl phosphate/nitro blue tetrazolium results
in the production of a densely blue-colored compound by immuno-localized alkaline
phosphatase.

Substantially pure polypeptides suitable for use as an immunogen can be obtained
from transfected cells, transformed cells, or wild-type cells. Polypeptide concentrations
in the final preparation can be adjusted, for example, by concentration on an Amicon
filter device, to the level of a few micrograms per milliliter. In addition, polypeptides
ranging in size from full-length polypeptides to polypeptides having as few as nine amino
acid residues can be utilized as immunogens. Such polypeptides can be produced in cell
culture, can be chemically synthesized using standard methods, or can be obtained by
cleaving large polypeptides into smaller polypeptides that can be purified. Polypeptides
having as few as nine amino acid residues in length can be immunogenic when presented
to an immune system in the context of a Major Histocompatibility Complex (MHC)
molecule such as an MHC class I or MHC class II molecule. Accordingly, polypeptides
having at least 9, 10, 11, 12, 13, 14, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 70, 80, 90, 100,
150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 900, 1000, 1050,
1100, 1150, 1200, 1250, 1300, 1350, or more consecutive amino acid residues of any
amino acid sequence disclosed herein can be used as immunogens for producing
antibodies.

Monoclonal antibodies to any of the polypeptides disclosed herein can be
prepared from murine hybridomas according to the classic method of Kohler & Milstein
(Nature 256:495 (1975)) or a derivative method thereof.

Polyclonal antiserum containing antibodies to the heterogeneous epitopes of any
polypeptide disclosed herein can be prepared by immunizing suitable animals with the
polypeptide (or fragment thereof), which can be unmodified or modified to enhance
immunogenicity. An effective immunization protocol for rabbits can be found in
Vaitukaitis et al. (J. Clin. E...).

Antibody fragments can be used in place of whole antibodies and can be readily
expressed in prokaryotic host cells. Methods of making and using immunologically
effective portions of monoclonal antibodies, also referred to as "antibody fragments," are
well known and include those described in Better & Horowitz (Methods Enzymol.
178:476-496 (1989)), Glockshuber et al. (Biochemistry 29:1362-1367 (1990)), U.S. Pat.
No. 5,648,237 ("Expression of Functional Antibody Fragments"), U.S. Pat. No. 4,946,778
("Single Polypeptide Chain Binding Molecules"), U.S. Pat. No. 5,455,030
("Immunotherapy Using Single Chain Polypeptide Binding Molecules"), and references
cited therein.

Operably linked: A first nucleic acid sequence is "operably linked" with a
second nucleic acid sequence whenever the first nucleic acid sequence is placed in a
functional relationship with the second nucleic acid sequence. For instance, a promoter is
operably linked to a coding sequence if the promoter affects the transcription of the
coding sequence. Generally, operably linked DNA sequences are contiguous and, where
necessary to join two polypeptide-coding regions, in the same reading frame.

Probes and primers: Nucleic acid probes and primers can be prepared readily
based on the amino acid sequences and nucleic acid sequences provided herein. A
"probe" includes an isolated nucleic acid containing a detectable label or reporter
molecule. Typical labels include radioactive isotopes, ligands, chemiluminescent agents,
and enzymes. Methods for labeling and guidance in the choice of labels appropriate for
various purposes are discussed in, for example, Sambrook et al. (ed.), Molecular Cloning:
A Laboratory Manual, 2nd ed., vol. 1-3, Cold Spring Harbor Laboratory Press, Cold
Spring Harbor, N.Y., 1989, and Ausubel et al. (ed.), Current Protocols in Molecular
Biology, Greene Publishing and Wiley-Interscience, New York (with periodic updates),
1987.

"Primers" are typically nucleic acid molecules having ten or more nucleotides
(e.g., nucleic acid molecules having between about 10 nucleotides and about 100
nucleotides). A primer can be annealed to a complementary target nucleic acid strand by
nucleic acid hybridization to form a hybrid between the primer and the target nucleic acid
strand, and then extended along the target nucleic acid strand by, for example, a DNA
polymerase enzyme. Primer pairs can be used for amplification of a nucleic acid
sequence, for example, by the polymerase chain reaction (PCR) or other nucleic-acid
amplification methods known in the art.

Methods for preparing and using probes and primers are described, for example,
in references such as Sambrook et al. (ed.), Molecular Cloning: A Laboratory Manual,
2nd ed., vol. 1-3, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989;
Ausubel et al. (ed.), Current Protocols in Molecular Biology, Greene Publishing and
Wiley-Interscience, New York (with periodic updates), 1987; and Innis et al., PCR
Protocols: A Guide to Methods and Applications, Academic Press: San Diego, 1990. PCR
primer pairs can be derived from a known sequence, for example, by using computer
programs intended for that purpose such as Primer (Version 0.5, © 1991,
Whitehead Institute for Biomedical Research, Cambridge, Mass.). One of skill in the art
will appreciate that the specificity of a particular probe or primer increases with the
length, but that a probe or primer can range in size from a full-length sequence to
sequences as short as five consecutive nucleotides. Thus, for example, a primer of 20
consecutive nucleotides can anneal to a target with a higher specificity than a
corresponding primer of only 15 nucleotides. Thus, in order to obtain greater specificity,
probes and primers can be selected that comprise, for example, 10, 20, 25, 30, 35, 40, 50,
60, 70, 80, 90, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800,
850, 900, 950, 1000, 1050, 1100, 1150, 1200, 1250, 1300, 1350, 1400, 1450, 1500, 1550,
1600, 1650, 1700, 1750, 1800, 1850, 1900, 2000, 2050, 2100, 2150, 2200, 2250, 2300,
2350, 2400, 2450, 2500, 2550, 2600, 2650, 2700, 2750, 2800, 2850, 2900, 3000, 3050,
3100, 3150, 3200, 3250, 3300, 3350, 3400, 3450, 3500, 3550, 3600, 3650, 3700, 3750,
3800, 3850, 3900, 4000, 4050, 4100, 4150, 4200, 4250, 4300, 4350, 4400, 4450, 4500,
4550, 4600, 4650, 4700, 4750, 4800, 4850, 4900, 5000, 5050, 5100, 5150, 5200, 5250,
5300, 5350, 5400, 5450, or more consecutive nucleotides.
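
The relationship between primer length and specificity can be illustrated with a
simple back-of-the-envelope model. The Python sketch below assumes a target of
uniformly random, equally frequent bases, which real genomes are not, and estimates
how many exact matches to a primer of a given length would be expected by chance on
one strand. The target length of 4,600,000 nucleotides is an arbitrary illustrative
figure, and the function name is hypothetical.

```python
# Sketch under a uniform-random-sequence assumption (an idealization, not a
# property of any real genome): expected chance matches of an n-mer primer.

def expected_chance_matches(primer_length: int, target_length: int) -> float:
    """Expected exact matches of one primer on one strand of a random target."""
    if primer_length > target_length:
        return 0.0
    positions = target_length - primer_length + 1  # possible start sites
    return positions * 0.25 ** primer_length       # each site matches with 4**-n


TARGET_LENGTH = 4_600_000  # illustrative target size, in nucleotides

if __name__ == "__main__":
    for n in (15, 20):
        print(n, expected_chance_matches(n, TARGET_LENGTH))
```

Under this model, each additional nucleotide lowers the expected number of chance
matches about fourfold, so a 20-nucleotide primer is expected to match by chance
roughly 1,000 times less often than a 15-nucleotide primer.
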
Percent sequence identity: The "percent sequence identity" between a particular
nucleic acid or amino acid sequence and a sequence referenced by a particular sequence
identification number is determined as follows. First, a nucleic acid or amino acid
sequence is compared to the sequence set forth in a particular sequence identification
number using the BLAST 2 Sequences (Bl2seq) program from the stand-alone version of
BLASTZ containing BLASTN version 2.0.14 and BLASTP version 2.0.14. This stand-alone
version of BLASTZ can be obtained from Fish & Richardson's web site
(www.fr.com) or the United States government's National Center for Biotechnology
Information web site (www.ncbi.nlm.nih.gov). Instructions explaining how to use the
Bl2seq program can be found in the readme file accompanying BLASTZ. Bl2seq
performs a comparison between two sequences using either the BLASTN or BLASTP
algorithm. BLASTN is used to compare nucleic acid sequences, while BLASTP is used
to compare amino acid sequences. To compare two nucleic acid sequences, the options
are set as follows: -i is set to a file containing the first nucleic acid sequence to be
compared (e.g., C:\seq1.txt); -j is set to a file containing the second nucleic acid sequence
to be compared (e.g., C:\seq2.txt); -p is set to blastn; -o is set to any desired file name
(e.g., C:\output.txt); -q is set to -1; -r is set to 2; and all other options are left at their
default setting. For example, the following command can be used to generate an output
file containing a comparison between two sequences: C:\Bl2seq -i c:\seq1.txt -j
c:\seq2.txt -p blastn -o c:\output.txt -q -1 -r 2. To compare two amino acid sequences,
the options of Bl2seq are set as follows: -i is set to a file containing the first amino acid
sequence to be compared (e.g., C:\seq1.txt); -j is set to a file containing the second amino
acid sequence to be compared (e.g., C:\seq2.txt); -p is set to blastp; -o is set to any desired
file name (e.g., C:\output.txt); and all other options are left at their default setting. For
example, the following command can be used to generate an output file containing a
comparison between two amino acid sequences: C:\Bl2seq -i c:\seq1.txt -j c:\seq2.txt -p
blastp -o c:\output.txt. If the two compared sequences share homology, then the
designated output file will present those regions of homology as aligned sequences. If the
two compared sequences do not share homology, then the designated output file will not
present aligned sequences.
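
The option settings described above can be assembled programmatically. The Python
sketch below only builds the argument list for the two cases (nucleic acid and
amino acid comparison) and does not run anything. The function name and the
default executable name are assumptions, and the legacy Bl2seq program may not be
available on a given system.

```python
# Sketch: build the Bl2seq argument list described in the text. Nothing is
# executed; the helper name and default executable name are hypothetical.

def bl2seq_args(first_file, second_file, output_file, program,
                executable="Bl2seq"):
    """Return the argument list for a nucleic acid or amino acid comparison."""
    if program not in ("blastn", "blastp"):
        raise ValueError("program must be 'blastn' or 'blastp'")
    args = [executable, "-i", first_file, "-j", second_file,
            "-p", program, "-o", output_file]
    if program == "blastn":
        # Nucleic acid comparisons additionally set -q to -1 and -r to 2.
        args += ["-q", "-1", "-r", "2"]
    return args


if __name__ == "__main__":
    print(" ".join(bl2seq_args(r"c:\seq1.txt", r"c:\seq2.txt",
                               r"c:\output.txt", "blastn")))
```

All options not listed are left at their defaults, which is why the amino acid
case carries no -q or -r arguments.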

Once aligned, the number of matches is determined by counting the number of
positions where an identical nucleotide or amino acid residue is presented in both
sequences. The percent sequence identity is determined by dividing the number of
matches either by the length of the sequence set forth in the identified sequence (e.g.,
SEQ ID NO: 1), or by an articulated length (e.g., 100 consecutive nucleotides or amino
acid residues from a sequence set forth in an identified sequence), followed by
multiplying the resulting value by 100. For example, a nucleic acid sequence that has
1166 matches when aligned with the sequence set forth in SEQ ID NO: 1 is 75.0 percent
identical to the sequence set forth in SEQ ID NO: 1 (i.e., 1166 ÷ 1554 × 100 = 75.0). It is
noted that the percent sequence identity value is rounded to the nearest tenth. For
example, 75.11, 75.12, 75.13, and 75.14 are rounded down to 75.1, while 75.15, 75.16,
75.17, 75.18, and 75.19 are rounded up to 75.2. It is also noted that the length value will
always be an integer. In another example, a target sequence containing a 20-nucleotide
region that aligns with 20 consecutive nucleotides from an identified sequence as follows
contains a region that shares 75 percent sequence identity to that identified sequence (i.e.,
15 ÷ 20 × 100 = 75).

                     1                 20
Target Sequence:     AGGTCGTGTACTGTCAGTCA
                     | || ||| |||| |||| |
Identified Sequence: ACGTGGTGAACTGCCAGTGA
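
The arithmetic described above can be sketched in Python as follows. The sketch
counts identical positions in two already-aligned, equal-length sequences, divides
by the chosen length, multiplies by 100, and rounds to the nearest tenth with halves
rounded up. The function names are hypothetical, and the second sequence below is a
reconstruction chosen to be consistent with the 15-of-20 match count stated in the
text, so it should be treated as an assumption.

```python
# Sketch of the percent sequence identity arithmetic. Function names are
# hypothetical. IDENTIFIED is a reconstructed sequence (assumption) that gives
# the 15 matches out of 20 positions stated in the text.
from decimal import Decimal, ROUND_HALF_UP


def round_to_tenth(value: Decimal) -> Decimal:
    """Round to the nearest tenth, rounding halves up (75.15 becomes 75.2)."""
    return value.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


def count_matches(a: str, b: str) -> int:
    """Count positions with identical residues in two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    return sum(x == y for x, y in zip(a, b))


def percent_identity(matches: int, length: int) -> Decimal:
    """matches / length * 100, rounded to the nearest tenth."""
    return round_to_tenth(Decimal(matches) * 100 / Decimal(length))


TARGET = "AGGTCGTGTACTGTCAGTCA"
IDENTIFIED = "ACGTGGTGAACTGCCAGTGA"  # reconstructed; see note above

if __name__ == "__main__":
    matches = count_matches(TARGET, IDENTIFIED)
    print(matches, percent_identity(matches, len(IDENTIFIED)))
    print(percent_identity(1166, 1554))
```

Decimal arithmetic is used instead of floating point so that values such as 75.15
round up as the text specifies.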

Conservative substitution: The term "conservative substitution" as used herein
refers to any of the amino acid substitutions set forth in Table 1. Typically, conservative
substitutions have little to no impact on the activity of a polypeptide. A polypeptide can
be produced to contain one or more conservative substitutions by manipulating the
nucleotide sequence that encodes that polypeptide using, for example, standard
procedures such as site-directed mutagenesis or PCR.
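
The substitutions of Table 1 can also be expressed as a lookup structure. The Python
sketch below encodes the table with standard three-letter residue codes and checks
whether a given replacement is listed for a given original residue. The names used
are illustrative assumptions. The lookup is directional, as the table is: Ala to
Ser is listed, while Ser to Ala is not.

```python
# Sketch: Table 1 as a dictionary keyed by original residue (three-letter
# codes). Names are hypothetical; the lookup follows the table's direction.

CONSERVATIVE_SUBSTITUTIONS = {
    "Ala": {"Ser"},
    "Arg": {"Lys"},
    "Asn": {"Gln", "His"},
    "Asp": {"Glu"},
    "Cys": {"Ser"},
    "Gln": {"Asn"},
    "Glu": {"Asp"},
    "Gly": {"Pro"},
    "His": {"Asn", "Gln"},
    "Ile": {"Leu", "Val"},
    "Leu": {"Ile", "Val"},
    "Lys": {"Arg", "Gln", "Glu"},
    "Met": {"Leu", "Ile"},
    "Phe": {"Met", "Leu", "Tyr"},
    "Ser": {"Thr"},
    "Thr": {"Ser"},
    "Trp": {"Tyr"},
    "Tyr": {"Trp", "Phe"},
    "Val": {"Ile", "Leu"},
}


def is_conservative(original: str, replacement: str) -> bool:
    """True if `replacement` is listed for `original` in Table 1."""
    allowed = CONSERVATIVE_SUBSTITUTIONS.get(original.capitalize(), set())
    return replacement.capitalize() in allowed


if __name__ == "__main__":
    print(is_conservative("Lys", "arg"), is_conservative("Ser", "Ala"))
```
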
Table 1

Original Residue    Conservative Substitution(s)
Ala                 ser
Arg                 lys
Asn                 gln; his
Asp                 glu
Cys                 ser
Gln                 asn
Glu                 asp
Gly                 pro
His                 asn; gln
Ile                 leu; val
Leu                 ile; val
Lys                 arg; gln; glu
Met                 leu; ile
Phe                 met; leu; tyr
Ser                 thr
Thr                 ser
Trp                 tyr
Tyr                 trp; phe
Val                 ile; leu



II. Metabolic Pathways 

The invention provides methods and materials related to producing 3-HP as well
as other organic compounds (e.g., 1,3-propanediol, acrylic acid, polymerized acrylate,
esters of acrylate, polymerized 3-HP, and esters of 3-HP). Specifically, the invention
provides isolated nucleic acids, polypeptides, host cells, and methods and materials for
producing 3-HP as well as other organic compounds such as 1,3-propanediol, acrylic
acid, polymerized acrylate, esters of acrylate, polymerized 3-HP, and esters of 3-HP.

Accordingly, the invention provides several metabolic pathways that can be used
to produce organic compounds from PEP (Figures 1-5, 43-44, 54, and 55). As depicted in
Figure 1, lactate can be converted into lactyl-CoA by a polypeptide having CoA
transferase activity (EC 2.8.3.1); the resulting lactyl-CoA can be converted into
acrylyl-CoA by a polypeptide (or multiple polypeptide complex such as an activated E2 α
and E2 β complex) having lactyl-CoA dehydratase activity (EC 4.2.1.54); the resulting
acrylyl-CoA can be converted into 3-hydroxypropionyl-CoA (3-HP-CoA) by a polypeptide
having 3-hydroxypropionyl-CoA dehydratase activity (EC 4.2.1.-); and the resulting
3-HP-CoA can be converted into 3-HP by a polypeptide having CoA transferase activity, a
polypeptide having 3-hydroxypropionyl-CoA hydrolase activity (EC 3.1.2.-), or a
polypeptide having 3-hydroxyisobutyryl-CoA hydrolase activity (EC 3.1.2.4).
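
The sequence of conversions described for Figure 1 can be written down as an ordered
list of steps, each with a substrate, a product, and the alternative enzymatic
activities named in the text. The Python sketch below is only a bookkeeping aid; the
type and function names are assumptions, and the helper merely checks that each
product is the substrate of the next step.

```python
# Sketch: the Figure 1 route as ordered steps. Type and function names are
# hypothetical; activities and EC numbers are those named in the text.
from typing import List, NamedTuple, Tuple


class Step(NamedTuple):
    substrate: str
    product: str
    activities: Tuple[str, ...]  # alternative activities for this conversion


FIGURE_1: List[Step] = [
    Step("lactate", "lactyl-CoA", ("CoA transferase (EC 2.8.3.1)",)),
    Step("lactyl-CoA", "acrylyl-CoA",
         ("lactyl-CoA dehydratase (EC 4.2.1.54)",)),
    Step("acrylyl-CoA", "3-HP-CoA",
         ("3-hydroxypropionyl-CoA dehydratase (EC 4.2.1.-)",)),
    Step("3-HP-CoA", "3-HP",
         ("CoA transferase",
          "3-hydroxypropionyl-CoA hydrolase (EC 3.1.2.-)",
          "3-hydroxyisobutyryl-CoA hydrolase (EC 3.1.2.4)")),
]


def is_contiguous(steps: List[Step]) -> bool:
    """True if every product is the substrate of the following step."""
    return all(a.product == b.substrate for a, b in zip(steps, steps[1:]))


def compounds(steps: List[Step]) -> List[str]:
    """Starting substrate followed by each product, in order."""
    return [steps[0].substrate] + [s.product for s in steps]


if __name__ == "__main__":
    print(is_contiguous(FIGURE_1), " -> ".join(compounds(FIGURE_1)))
```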

Polypeptides having CoA transferase activity as well as nucleic acid encoding
such polypeptides can be obtained from various species including, without limitation,
Megasphaera elsdenii, Clostridium propionicum, Clostridium kluyveri, and Escherichia
coli. For example, nucleic acid that encodes a polypeptide having CoA transferase
activity can be obtained from Megasphaera elsdenii as described in Example 1 and can
have a sequence as set forth in SEQ ID NO: 1. In addition, polypeptides having CoA
transferase activity as well as nucleic acid encoding such polypeptides can be obtained as
described herein. For example, the variations to SEQ ID NO: 1 provided herein can be
used to encode a polypeptide having CoA transferase activity.

Polypeptides (or the polypeptides of a multiple polypeptide complex such as an
activated E2 α and E2 β complex) having lactyl-CoA dehydratase activity as well as
nucleic acid encoding such polypeptides can be obtained from various species including,
without limitation, Megasphaera elsdenii and Clostridium propionicum. For example,
nucleic acid encoding an E1 activator, an E2 α subunit, and an E2 β subunit that can form
a multiple polypeptide complex having lactyl-CoA dehydratase activity can be obtained
from Megasphaera elsdenii as described in Example 2. The nucleic acid encoding the E1
activator can contain a sequence as set forth in SEQ ID NO: 9; the nucleic acid encoding
the E2 α subunit can contain a sequence as set forth in SEQ ID NO: 17; and the nucleic
acid encoding the E2 β subunit can contain a sequence as set forth in SEQ ID NO: 25. In
addition, polypeptides (or the polypeptides of a multiple polypeptide complex) having
lactyl-CoA dehydratase activity as well as nucleic acid encoding such polypeptides can be
obtained as described herein. For example, the variations to SEQ ID NO: 9, 17, and 25
provided herein can be used to encode the polypeptides of a multiple polypeptide
complex having CoA transferase activity.

Polypeptides having 3-hydroxypropionyl-CoA dehydratase activity as well as
nucleic acid encoding such polypeptides can be obtained from various species including,
without limitation, Chloroflexus aurantiacus, Candida rugosa, Rhodospirillum rubrum,
and Rhodobacter capsulatus. For example, nucleic acid that encodes a polypeptide
having 3-hydroxypropionyl-CoA dehydratase activity can be obtained from Chloroflexus
aurantiacus as described in Example 3 and can have a sequence as set forth in SEQ ID
NO: 40. In addition, polypeptides having 3-hydroxypropionyl-CoA dehydratase activity
as well as nucleic acid encoding such polypeptides can be obtained as described herein.
For example, the variations to SEQ ID NO: 40 provided herein can be used to encode a
polypeptide having 3-hydroxypropionyl-CoA dehydratase activity.

Polypeptides having 3-hydroxypropionyl-CoA hydrolase activity as well as
nucleic acid encoding such polypeptides can be obtained from various species including,
without limitation, Candida rugosa. Polypeptides having 3-hydroxyisobutyryl-CoA
hydrolase activity as well as nucleic acid encoding such polypeptides can be obtained
from various species including, without limitation, [...] Rattus, and
Homo sapiens. For example, nucleic acid that encodes a polypeptide having
3-hydroxyisobutyryl-CoA hydrolase activity can be obtained from Homo sapiens and can
have a sequence as set forth in GenBank® accession number U66669.

The term "polypeptide having enzymatic activity" as used herein refers to any
polypeptide that catalyzes a chemical reaction of other substances without itself being
destroyed or altered upon completion of the reaction. Typically, a polypeptide having
enzymatic activity catalyzes the formation of one or more products from one or more
substrates. Such polypeptides can have any type of enzymatic activity including, without
limitation, the enzymatic activity or enzymatic activities associated with enzymes such as
dehydratases/hydratases, 3-hydroxypropionyl-CoA dehydratases/hydratases, CoA
transferases, lactyl-CoA dehydratases, 3-hydroxypropionyl-CoA hydrolases,
3-hydroxyisobutyryl-CoA hydrolases, poly hydroxyacid synthases, CoA synthetases,
malonyl-CoA reductases, β-alanine ammonia lyases, and lipases.

As depicted in Figure 2, lactate can be converted into lactyl-CoA by a polypeptide
having CoA synthetase activity (EC 6.2.1.-); the resulting lactyl-CoA can be converted
into acrylyl-CoA by a polypeptide (or multiple polypeptide complex) having lactyl-CoA
dehydratase activity; the resulting acrylyl-CoA can be converted into 3-HP-CoA by a
polypeptide having 3-hydroxypropionyl-CoA dehydratase activity; and the resulting
3-HP-CoA can be converted into polymerized 3-HP by a polypeptide having poly
hydroxyacid synthase activity (EC 2.3.1.-). Polypeptides having CoA synthetase activity
as well as nucleic acid encoding such polypeptides can be obtained from various species
including, without limitation, Escherichia coli, Rhodobacter sphaeroides, Saccharomyces
cerevisiae, and Salmonella enterica. For example, nucleic acid that encodes a polypeptide
having CoA synthetase activity can be obtained from Escherichia coli and can have a
sequence as set forth in GenBank® accession number U00006. Polypeptides (or multiple
polypeptide complexes) having lactyl-CoA dehydratase activity as well as nucleic acid
encoding such polypeptides can be obtained as provided herein. Polypeptides having
3-hydroxypropionyl-CoA dehydratase activity as well as nucleic acid encoding such
polypeptides also can be obtained as provided herein. Polypeptides having poly
hydroxyacid synthase activity as well as nucleic acid encoding such polypeptides can be
obtained from various species including, without limitation, Rhodobacter sphaeroides,
Comamonas acidovorans, Ralstonia eutropha, and Pseudomonas oleovorans. For
example, nucleic acid that encodes a polypeptide having poly hydroxyacid synthase
activity can be obtained from Rhodobacter sphaeroides and can have a sequence as set
forth in GenBank® accession number X97200.

As depicted in Figure 3, lactate can be converted into lactyl-CoA by a polypeptide
having CoA transferase activity; the resulting lactyl-CoA can be converted into
acrylyl-CoA by a polypeptide (or multiple polypeptide complex) having lactyl-CoA
dehydratase activity; the resulting acrylyl-CoA can be converted into 3-HP-CoA by a
polypeptide having 3-hydroxypropionyl-CoA dehydratase activity; the resulting 3-HP-CoA
can be converted into 3-HP by a polypeptide having CoA transferase activity, a polypeptide
having 3-hydroxypropionyl-CoA hydrolase activity, or a polypeptide having
3-hydroxyisobutyryl-CoA hydrolase activity; and the resulting 3-HP can be converted into an
ester of 3-HP by a polypeptide having lipase activity (EC 3.1.1.-). Polypeptides having
lipase activity as well as nucleic acid encoding such polypeptides can be obtained from
various species including, without limitation, Candida rugosa, Candida tropicalis, and
Candida albicans. For example, nucleic acid that encodes a polypeptide having lipase
activity can be obtained from Candida rugosa and can have a sequence as set forth in
GenBank® accession number A81171.

As depicted in Figure 4, lactate can be converted into lactyl-CoA by a polypeptide
having CoA synthetase activity; the resulting lactyl-CoA can be converted into
acrylyl-CoA by a polypeptide (or multiple polypeptide complex) having lactyl-CoA
dehydratase activity; and the resulting acrylyl-CoA can be converted into polymerized
acrylate by a polypeptide having poly hydroxyacid synthase activity.

As depicted in Figure 5, lactate can be converted into lactyl-CoA by a polypeptide
having CoA transferase activity; the resulting lactyl-CoA can be converted into
acrylyl-CoA by a polypeptide (or multiple polypeptide complex) having lactyl-CoA
dehydratase activity; the resulting acrylyl-CoA can be converted into acrylate by a
polypeptide having CoA transferase activity; and the resulting acrylate can be converted
into an ester of acrylate by a polypeptide having lipase activity.
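
Taken together, the conversions described for Figures 1 through 5 form a small
branching network that starts at lactate. The Python sketch below is an illustrative
aid: it records those conversions as a directed graph and uses a breadth-first search
to find a route from one compound to another. The dictionary and function names are
assumptions, and the enzymatic activity for each edge is omitted for brevity.

```python
# Sketch: conversions of Figures 1-5 as a directed graph, with a breadth-first
# route search. Names are hypothetical; enzyme activities are omitted.
from collections import deque

CONVERSIONS = {
    "lactate": ["lactyl-CoA"],
    "lactyl-CoA": ["acrylyl-CoA"],
    "acrylyl-CoA": ["3-HP-CoA", "polymerized acrylate", "acrylate"],
    "3-HP-CoA": ["3-HP", "polymerized 3-HP"],
    "3-HP": ["ester of 3-HP"],
    "acrylate": ["ester of acrylate"],
}


def route(start: str, goal: str):
    """Return one shortest list of compounds from start to goal, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in CONVERSIONS.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


if __name__ == "__main__":
    print(route("lactate", "ester of 3-HP"))
    print(route("lactate", "ester of acrylate"))
```

Every listed end product is reached through acrylyl-CoA, which is the branch point
shared by the five figures.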

As depicted in Figure 44, acetyl-CoA can be converted into malonyl-CoA by a
polypeptide having acetyl-CoA carboxylase activity, and the resulting malonyl-CoA can
be converted into 3-HP by a polypeptide having malonyl-CoA reductase activity.
Polypeptides having acetyl-CoA carboxylase activity as well as nucleic acid encoding
such polypeptides can be obtained from various species including, without limitation,
Escherichia coli and Chloroflexus aurantiacus. For example, nucleic acid that encodes a
polypeptide having acetyl-CoA carboxylase activity can be obtained from Escherichia
coli and can have a sequence as set forth in GenBank® accession number M96394 or
U18997. Polypeptides having malonyl-CoA reductase activity as well as nucleic acid
encoding such polypeptides can be obtained from various species including, without
limitation, Chloroflexus aurantiacus, Sulfolobus metallicus, and Acidianus brierleyi. For
example, nucleic acid that encodes a polypeptide having malonyl-CoA reductase activity
can be obtained as described herein and can have a sequence similar to the sequence set
forth in SEQ ID NO: 140. In addition, polypeptides having malonyl-CoA reductase
activity as well as nucleic acid encoding such polypeptides can be obtained as described
herein. For example, the variations to SEQ ID NO: 140 provided herein can be used to
encode a polypeptide having malonyl-CoA reductase activity.

Polypeptides having malonyl-CoA reductase activity can use NADPH as a
co-factor. For example, the polypeptide having the amino acid sequence set forth in SEQ ID
NO: 141 is a polypeptide having malonyl-CoA reductase activity that uses NADPH as a
co-factor when converting malonyl-CoA into 3-HP. Likewise, polypeptides having
malonyl-CoA reductase activity can use NADH as a co-factor. Such polypeptides can be
obtained by converting a polypeptide that has malonyl-CoA reductase activity and uses
NADPH as a cofactor into a polypeptide that has malonyl-CoA reductase activity and
uses NADH as a cofactor. Any method can be used to convert a polypeptide that uses
NADPH as a cofactor into a polypeptide that uses NADH as a cofactor such as those
described by others (Eppink et al., J. Mol. Biol., 292(1):87-96 (1999), Hall and Tomsett,
Microbiology, 146(Pt 6):1399-1406 (2000), and Dohr et al., Proc. Natl. Acad. Sci.,
98(1):81-86 (2001)). For example, mutagenesis can be used to convert the polypeptide
encoded by the nucleic acid sequence set forth in SEQ ID NO: 140 into a polypeptide
that, when converting malonyl-CoA into 3-HP, uses NADH as a co-factor instead of
NADPH.

As depicted in Figure 43, propionate can be converted into propionyl-CoA by a
polypeptide having CoA synthetase activity such as the polypeptide having the sequence
set forth in SEQ ID NO: 39; the resulting propionyl-CoA can be converted into
acrylyl-CoA by a polypeptide having dehydrogenase activity such as the polypeptide
having the sequence set forth in SEQ ID NO: 39; and the resulting acrylyl-CoA can be
converted into (1) acrylate by a polypeptide having CoA transferase activity or CoA
hydrolase activity, (2) 3-HP-CoA by a polypeptide having 3-HP dehydratase activity (also
referred to as acrylyl-CoA hydratase or simply hydratase) such as the polypeptide having
the sequence set forth in SEQ ID NO: 39, or (3) polymerized acrylate by a polypeptide
having poly hydroxyacid synthase activity. The resulting acrylate can be converted into an
ester of acrylate by a polypeptide having lipase activity. The resulting 3-HP-CoA can be
converted into (1) 3-HP by a polypeptide having CoA transferase activity, a polypeptide
having 3-hydroxypropionyl-CoA hydrolase activity (EC 3.1.2.-), or a polypeptide having
3-hydroxyisobutyryl-CoA hydrolase activity (EC 3.1.2.4), or (2) polymerized 3-HP by a
polypeptide having poly hydroxyacid synthase activity (EC 2.3.1.-).

As depicted in Figure 54, PEP can be converted into β-alanine. β-alanine can be
converted into β-alanyl-CoA through the use of a polypeptide having CoA transferase
activity. β-alanyl-CoA can then be converted into acrylyl-CoA through the use of a
polypeptide having β-alanyl-CoA ammonia lyase activity. Acrylyl-CoA can then be
converted into 3-HP-CoA through the use of a polypeptide having 3-HP-CoA dehydratase
activity, and a polypeptide having glutamate dehydrogenase activity can be used to
convert 3-HP-CoA into 3-HP.

As depicted in Figure 55, 3-HP can be made from β-alanine by first contacting
β-alanine with a polypeptide having 4,4-aminobutyrate aminotransferase activity to create
malonate semialdehyde. The malonate semialdehyde can be converted into 3-HP with a
polypeptide having 3-HP dehydrogenase activity or a polypeptide having
3-hydroxyisobutyrate dehydrogenase activity.

III. Nucleic acid molecules and polypeptides

The invention provides isolated nucleic acid that contains the entire nucleic acid
sequence set forth in SEQ ID NO:1, 9, 17, 25, 33, 34, 36, 38, 40, 42, 129, 140, 142, 162,
or 163. In addition, the invention provides isolated nucleic acid that contains a portion of
the nucleic acid sequence set forth in SEQ ID NO:1, 9, 17, 25, 33, 34, 36, 38, 40, 42, 129,
140, 142, 162, or 163. For example, the invention provides isolated nucleic acid that
contains a 15 nucleotide sequence identical to any 15 nucleotide sequence set forth in
SEQ ID NO:1, 9, 17, 25, 33, 34, 36, 38, 40, 42, 129, 140, 142, 162, or 163 including,
without limitation, the sequence starting at nucleotide number 1 and ending at nucleotide
number 15, the sequence starting at nucleotide number 2 and ending at nucleotide number
16, the sequence starting at nucleotide number 3 and ending at nucleotide number 17, and
so forth. It will be appreciated that the invention also provides isolated nucleic acid that
contains a nucleotide sequence that is greater than 15 nucleotides (e.g., 16, 17, 18, 19, 20,
21, 22, 23, 24, 25, 26, 27, 28, 29, 30, or more nucleotides) in length and identical to any
portion of the sequence set forth in SEQ ID NO:1, 9, 17, 25, 33, 34, 36, 38, 40, 42, 129,
140, 142, 162, or 163. For example, the invention provides isolated nucleic acid that
contains a 25 nucleotide sequence identical to any 25 nucleotide sequence set forth in
SEQ ID NO:1, 9, 17, 25, 33, 34, 36, 38, 40, 42, 129, 140, 142, 162, or 163 including,
without limitation, the sequence starting at nucleotide number 1 and ending at nucleotide
number 25, the sequence starting at nucleotide number 2 and ending at nucleotide number
26, the sequence starting at nucleotide number 3 and ending at nucleotide number 27, and
so forth. Additional examples include, without limitation, isolated nucleic acids that
contain a nucleotide sequence that is 50 or more nucleotides (e.g., 100, 150, 200, 250,
300, or more nucleotides) in length and identical to any portion of the sequence set forth
in SEQ ID NO:1, 9, 17, 25, 33, 34, 36, 38, 40, 42, 129, 140, 142, 162, or 163. Such
isolated nucleic acids can include, without limitation, those isolated nucleic acids
containing a nucleic acid sequence represented in a single line of sequence depicted in
Figure 6, 10, [...] since each line of sequence
depicted in these figures, with the possible exception of the last line, provides a
nucleotide sequence containing at least 50 bases.
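
The "starting at nucleotide number 1 and ending at nucleotide number 15, ... and so forth"
enumeration above is a sliding window over the sequence. A minimal sketch of that
enumeration follows; the function name and the 20-nucleotide toy sequence are hypothetical
and are not taken from any SEQ ID NO.

```python
def contiguous_windows(sequence: str, length: int) -> list[tuple[int, int, str]]:
    """Return (start, end, fragment) for every contiguous window of `length`.

    Positions are 1-based and inclusive, matching "nucleotide number 1 ... 15".
    """
    if length <= 0:
        raise ValueError("length must be positive")
    windows = []
    # A sequence of n residues has n - length + 1 windows (none if too short).
    for start in range(len(sequence) - length + 1):
        windows.append((start + 1, start + length, sequence[start:start + length]))
    return windows


# Hypothetical stand-in sequence, used only to show the numbering.
toy_sequence = "atggctaaaccgttagcatc"
for start, end, fragment in contiguous_windows(toy_sequence, 15):
    print(start, end, fragment)
```

The same function covers the 25 nucleotide and 50 or more nucleotide cases by changing
`length`, and it applies unchanged to amino acid sequences.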

In addition, the invention provides isolated nucleic acid that contains a variation
of the nucleic acid sequence set forth in SEQ ID NO:1, 9, 17, 25, 33, 34, 36, 38, 40, 42,
129, 140, 142, 162, or 163. For example, the invention provides isolated nucleic acid
containing a nucleic acid sequence set forth in SEQ ID NO:1, 9, 17, 25, 33, 34, 36, 38,
40, 42, 129, 140, 142, 162, or 163 that contains a single insertion, a single deletion, a
single substitution, multiple insertions, multiple deletions, multiple substitutions, or any
combination thereof (e.g., single deletion together with multiple insertions). Such
isolated nucleic acid molecules can share at least 60, 65, 70, 75, 80, 85, 90, 95, 97, 98, or
99 percent sequence identity with a sequence set forth in SEQ ID NO:1, 9, 17, 25, 33, 34,
36, 38, 40, 42, 129, 140, 142, 162, or 163.
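
As an illustration of a percent sequence identity threshold, the sketch below scores two
sequences that have already been aligned. It assumes one common convention (identical
columns divided by aligned columns, with a gap opposite a residue counted as a mismatch);
this passage does not state which convention or alignment program governs, so the
formula and the function name are assumptions.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned, equal-length sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue  # a column that is a gap in both sequences carries no information
        columns += 1
        if a == b:
            matches += 1  # both are residues here, since gap/gap was skipped above
    if columns == 0:
        raise ValueError("no aligned columns to compare")
    return 100.0 * matches / columns


# Hypothetical aligned fragments: one mismatch in ten columns.
print(percent_identity("acgtacgtac", "acgtaagtac"))
```

A variant would then be said to meet, for example, the "at least 90 percent" level when the
returned value is 90.0 or higher under the chosen convention.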

The invention provides multiple examples of isolated nucleic acid that contains a
variation of a nucleic acid sequence set forth in SEQ ID NO:1, 9, 17, 25, 33, 34, 36, 38,
40, 42, 129, 140, 142, 162, or 163. For example, Figure 8 provides the sequence set forth
in SEQ ID NO:1 aligned with three other nucleic acid sequences. Examples of variations
of the sequence set forth in SEQ ID NO:1 include, without limitation, any variation of the
sequence set forth in SEQ ID NO:1 provided in Figure 8. Such variations are provided in
Figure 8 in that a comparison of the nucleotide (or lack thereof) at a particular position of
the sequence set forth in SEQ ID NO:1 with the nucleotide (or lack thereof) at the same
aligned position of any of the other three nucleic acid sequences depicted in Figure 8 (i.e.,
SEQ ID NOs:3, 4, and 5) provides a list of specific changes for the sequence set forth in
SEQ ID NO:1. For example, the "a" at position 49 of SEQ ID NO:1 can be substituted
with a "c" as indicated in Figure 8. As also indicated in Figure 8, the "a" at position 590
of SEQ ID NO:1 can be substituted with an "atgg"; an "aaac" can be inserted before the
"g" at position 393 of SEQ ID NO:1; or the "gaa" at position 736 of SEQ ID NO:1 can be
deleted. It will be appreciated that the sequence set forth in SEQ ID NO:1 can contain
any number of variations as well as any combination of types of variations. For example,
the sequence set forth in SEQ ID NO:1 can contain one variation provided in Figure 8 or
more than one (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 50, 100, or more) of the variations
provided in Figure 8. It is noted that the nucleic acid sequences provided by Figure 8 can
encode polypeptides having CoA transferase activity. The invention also provides
isolated nucleic acid that contains a variant of a portion of the sequence set forth in SEQ
ID NO:1 as depicted in Figure 8 and described herein.
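
The three kinds of change named above (substitution, insertion before a position, and
deletion) can be sketched as simple string operations using 1-based positions. The helper
names and the nine-nucleotide toy sequence are hypothetical; the positions in the
specification (49, 590, 393, 736) refer to SEQ ID NO:1, which is not reproduced here.

```python
def substitute(sequence: str, position: int, old: str, new: str) -> str:
    """Replace `old`, found at 1-based `position`, with `new`."""
    i = position - 1
    if sequence[i:i + len(old)] != old:
        raise ValueError(f"expected {old!r} at position {position}")
    return sequence[:i] + new + sequence[i + len(old):]


def insert_before(sequence: str, position: int, inserted: str) -> str:
    """Insert `inserted` immediately before the residue at 1-based `position`."""
    i = position - 1
    return sequence[:i] + inserted + sequence[i:]


def delete(sequence: str, position: int, removed: str) -> str:
    """Delete `removed`, found starting at 1-based `position`."""
    return substitute(sequence, position, removed, "")


toy = "atggaacgt"  # hypothetical sequence, for illustration only
print(substitute(toy, 2, "t", "c"))     # single-base substitution
print(substitute(toy, 1, "a", "atgg"))  # one base replaced by four
print(insert_before(toy, 3, "aaac"))    # insertion before position 3
print(delete(toy, 4, "gaa"))            # three-base deletion
```

When several variations are combined, applying them from the highest position to the
lowest keeps the earlier coordinates valid, because insertions and deletions shift every
position downstream of them.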

Likewise, Figure 12 provides variations of SEQ ID NO:9 and portions thereof;
Figure 16 provides variations of SEQ ID NO:17 and portions thereof; Figure 20 provides
variations of SEQ ID NO:25 and portions thereof; Figure 32 provides variations of SEQ
ID NO:40 and portions thereof; and Figure 53 provides variations of SEQ ID NO:140.

The invention provides isolated nucleic acid that contains a nucleic acid sequence
that encodes the entire amino acid sequence set forth in SEQ ID NO:2, 10, 18, 26, 35, 37,
39, 41, 141, 160, or 161. In addition, the invention provides isolated nucleic acid that
contains a nucleic acid sequence that encodes a portion of the amino acid sequence set
forth in SEQ ID NO:2, 10, 18, 26, 35, 37, 39, 41, 141, 160, or 161. For example, the
invention provides isolated nucleic acid that contains a nucleic acid sequence that encodes
a 15 amino acid sequence identical to any 15 amino acid sequence set forth in SEQ ID
NO:2, 10, 18, 26, 35, 37, 39, 41, 141, 160, or 161 including, without limitation, the
sequence starting at amino acid residue number 1 and ending at amino acid residue
number 15, the sequence starting at amino acid residue number 2 and ending at amino
acid residue number 16, the sequence starting at amino acid residue number 3 and ending
at amino acid residue number 17, and so forth. It will be appreciated that the invention
also provides isolated nucleic acid that contains a nucleic acid sequence that encodes an
amino acid sequence that is greater than 15 amino acid residues (e.g., 16, 17, 18, 19, 20,
21, 22, 23, 24, 25, 26, 27, 28, 29, 30, or more amino acid residues) in length and identical
to any portion of the sequence set forth in SEQ ID NO:2, 10, 18, 26, 35, 37, 39, 41, 141,
160, or 161. For example, the invention provides isolated nucleic acid that contains a
nucleic acid sequence that encodes a 25 amino acid sequence identical to any 25 amino
acid sequence set forth in SEQ ID NO:2, 10, 18, 26, 35, 37, 39, 41, 141, 160, or 161
including, without limitation, the sequence starting at amino acid residue number 1 and
ending at amino acid residue number 25, the sequence starting at amino acid residue
number 2 and ending at amino acid residue number 26, the sequence starting at amino
acid residue number 3 and ending at amino acid residue number 27, and so forth.
Additional examples include, without limitation, isolated nucleic acids that contain a
nucleic acid sequence that encodes an amino acid sequence that is 50 or more amino acid
residues (e.g., 100, 150, 200, 250, 300, or more amino acid residues) in length and
identical to any portion of the sequence set forth in SEQ ID NO:2, 10, 18, 26, 35, 37, 39,
41, 141, 160, or 161. Such isolated nucleic acids can include, without limitation, those
isolated nucleic acids containing a nucleic acid sequence that encodes an amino acid
sequence represented in a single line of sequence depicted in Figure 7, 11, 15, 19, 24, 26,
28, 30, or 50 since each line of sequence depicted in these figures, with the possible
exception of the last line, provides an amino acid sequence containing at least 50 residues.

In addition, the invention provides isolated nucleic acid that contains a nucleic
acid sequence that encodes an amino acid sequence having a variation of the amino acid
sequence set forth in SEQ ID NO:2, 10, 18, 26, 35, 37, 39, 41, 141, 160, or 161. For
example, the invention provides isolated nucleic acid containing a nucleic acid sequence
encoding an amino acid sequence set forth in SEQ ID NO:2, 10, 18, 26, 35, 37, 39, 41,
141, 160, or 161 that contains a single insertion, a single deletion, a single substitution,
multiple insertions, multiple deletions, multiple substitutions, or any combination thereof
(e.g., single deletion together with multiple insertions). Such isolated nucleic acid
molecules can contain a nucleic acid sequence encoding an amino acid sequence that
shares at least 60, 65, 70, 75, 80, 85, 90, 95, 97, 98, or 99 percent sequence identity with a
sequence set forth in SEQ ID NO:2, 10, 18, 26, 35, 37, 39, 41, 141, 160, or 161.

The invention provides multiple examples of isolated nucleic acid containing a
nucleic acid sequence encoding an amino acid sequence having a variation of an amino
acid sequence set forth in SEQ ID NO:2, 10, 18, 26, 35, 37, 39, 41, 141, 160, or 161. For
example, Figure 9 provides the amino acid sequence set forth in SEQ ID NO:2 aligned
with three other amino acid sequences. Examples of variations of the sequence set forth
in SEQ ID NO:2 include, without limitation, any variation of the sequence set forth in
SEQ ID NO:2 provided in Figure 9. Such variations are provided in Figure 9 in that a
comparison of the amino acid residue (or lack thereof) at a particular position of the
sequence set forth in SEQ ID NO:2 with the amino acid residue (or lack thereof) at the
same aligned position of any of the other three amino acid sequences of Figure 9 (i.e.,
SEQ ID NOs:6, 7, and 8) provides a list of specific changes for the sequence set forth in
SEQ ID NO:2. For example, the "k" at position 17 of SEQ ID NO:2 can be substituted
with a "p" or "h" as indicated in Figure 9. As also indicated in Figure 9, the "y" at
position 125 of SEQ ID NO:2 can be substituted with an "i" or "f". It will be appreciated
that the sequence set forth in SEQ ID NO:2 can contain any number of variations as well
as any combination of types of variations. For example, the sequence set forth in SEQ ID
NO:2 can contain one variation provided in Figure 9 or more than one (e.g., 2, 3, 4, 5, 6,
7, 8, 9, 10, 15, 20, 25, 50, 100, or more) of the variations provided in Figure 9. It is noted
that the amino acid sequences provided in Figure 9 can be polypeptides having CoA
transferase activity.

The invention also provides isolated nucleic acid containing a nucleic acid
sequence encoding an amino acid sequence that contains a variant of a portion of the
sequence set forth in SEQ ID NO:2 as depicted in Figure 9 and described herein.

Likewise, Figure 13 provides variations of SEQ ID NO:10 and portions thereof;
Figure 17 provides variations of SEQ ID NO:18 and portions thereof; Figure 21 provides
variations of SEQ ID NO:26 and portions thereof; Figure 33 provides variations of SEQ
ID NO:41 and portions thereof; Figures 40, 41, and 42 provide variations of SEQ ID
NO:39; and Figure 52 provides variations of SEQ ID NO:141 and portions thereof.

It is noted that codon preferences and codon usage tables for a particular species
can be used to engineer isolated nucleic acid molecules that take advantage of the codon
usage preferences of that particular species. For example, the isolated nucleic acid
provided herein can be designed to have codons that are preferentially used by a
particular organism of interest.
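
A minimal sketch of recoding an open reading frame toward preferred codons is given
below. The codon table is the standard genetic code; the `PREFERRED_CODON` entries are
hypothetical placeholders rather than measured usage data for any organism, and the
function names are illustrative.

```python
from itertools import product

# Standard genetic code, listed with bases in T, C, A, G order (third base fastest).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Hypothetical preferences; a real table would come from codon usage data
# for the particular organism of interest.
PREFERRED_CODON = {"A": "GCG", "L": "CTG", "K": "AAA"}


def translate(coding_sequence: str) -> str:
    """Translate an in-frame DNA coding sequence ('*' marks a stop codon)."""
    seq = coding_sequence.upper()
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def recode(coding_sequence: str, preferred: dict[str, str] = PREFERRED_CODON) -> str:
    """Swap each codon for the preferred synonymous codon, when one is listed."""
    seq = coding_sequence.upper()
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    recoded = []
    for i in range(0, len(seq), 3):
        codon = seq[i:i + 3]
        recoded.append(preferred.get(CODON_TABLE[codon], codon))
    return "".join(recoded)


original = "ATGGCTTTAAAGTAA"  # hypothetical open reading frame
optimized = recode(original)
print(optimized, translate(original) == translate(optimized))
```

Codons for amino acids missing from the preference table are left untouched, so the
encoded amino acid sequence is unchanged by construction.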

The invention also provides isolated nucleic acid that is at least about 12 bases in
length (e.g., at least about 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 40, 50, 60, 100, 250, 500,
750, 1000, 1500, 2000, 3000, 4000, or 5000 bases in length) and hybridizes, under
hybridization conditions, to the sense or antisense strand of a nucleic acid having the
sequence set forth in SEQ ID NO:1, 9, 17, 25, 33, 34, 36, 38, 40, 42, 129, 140, 142, 162,
or 163. The hybridization conditions can be moderately or highly stringent hybridization
conditions.

The invention provides polypeptides that contain the entire amino acid sequence
set forth in SEQ ID NO:2, 10, 18, 26, 35, 37, 39, 41, 141, 160, or 161. In addition, the
invention provides polypeptides that contain a portion of the amino acid sequence set
forth in SEQ ID NO:2, 10, 18, 26, 35, 37, 39, 41, 141, 160, or 161. For example, the
invention provides polypeptides that contain a 15 amino acid sequence identical to any 15
amino acid sequence set forth in SEQ ID NO:2, 10, 18, 26, 35, 37, 39, 41, 141, 160, or
161 including, without limitation, the sequence starting at amino acid residue number 1
and ending at amino acid residue number 15, the sequence starting at amino acid residue
number 2 and ending at amino acid residue number 16, the sequence starting at amino
acid residue number 3 and ending at amino acid residue number 17, and so forth. It will
be appreciated that the invention also provides polypeptides that contain an amino acid
sequence that is greater than 15 amino acid residues (e.g., 16, 17, 18, 19, 20, 21, 22, 23,
24, 25, 26, 27, 28, 29, 30, or more amino acid residues) in length and identical to any
portion of the sequence set forth in SEQ ID NO:2, 10, 18, 26, 35, 37, 39, 41, 141, 160, or
161. For example, the invention provides polypeptides that contain a 25 amino acid
sequence identical to any 25 amino acid sequence set forth in SEQ ID NO:2, 10, 18, 26,
35, 37, 39, 41, 141, 160, or 161 including, without limitation, the sequence starting at
amino acid residue number 1 and ending at amino acid residue number 25, the sequence
starting at amino acid residue number 2 and ending at amino acid residue number 26, the
sequence starting at amino acid residue number 3 and ending at amino acid residue
number 27, and so forth. Additional examples include, without limitation, polypeptides
that contain an amino acid sequence that is 50 or more amino acid residues (e.g., 100,
150, 200, 250, 300, or more amino acid residues) in length and identical to any portion of
the sequence set forth in SEQ ID NO:2, 10, 18, 26, 35, 37, 39, 41, 141, 160, or 161. Such
polypeptides can include, without limitation, those polypeptides containing an amino acid
sequence represented in a single line of sequence depicted in Figure 7, 11, 15, 19, 24, 26,
28, 30, or 50 since each line of sequence depicted in these figures, with the possible
exception of the last line, provides an amino acid sequence containing at least 50 residues.

In addition, the invention provides polypeptides that contain an amino acid sequence
having a variation of the amino acid sequence set forth in SEQ ID NO:2, 10, 18, 26, 35,
37, 39, 41, 141, 160, or 161. For example, the invention provides polypeptides
containing an amino acid sequence set forth in SEQ ID NO:2, 10, 18, 26, 35, 37, 39, 41,
141, 160, or 161 that contains a single insertion, a single deletion, a single substitution,
multiple insertions, multiple deletions, multiple substitutions, or any combination thereof
(e.g., single deletion together with multiple insertions). Such polypeptides can contain an
amino acid sequence that shares at least 60, 65, 70, 75, 80, 85, 90, 95, 97, 98, or 99
percent sequence identity with a sequence set forth in SEQ ID NO:2, 10, 18, 26, 35, 37,
39, 41, 141, 160, or 161.

The invention provides multiple examples of polypeptides containing an amino
acid sequence having a variation of an amino acid sequence set forth in SEQ ID NO:2,
10, 18, 26, 35, 37, 39, 41, 141, 160, or 161. For example, Figure 9 provides the amino
acid sequence set forth in SEQ ID NO:2 aligned with three other amino acid sequences.
Examples of variations of the sequence set forth in SEQ ID NO:2 include, without
limitation, any variation of the sequence set forth in SEQ ID NO:2 provided in Figure 9.
Such variations are provided in Figure 9 in that a comparison of the amino acid residue
(or lack thereof) at a particular position of the sequence set forth in SEQ ID NO:2 with
the amino acid residue (or lack thereof) at the same aligned position of any of the other
three amino acid sequences of Figure 9 (i.e., SEQ ID NOs:6, 7, and 8) provides a list of
specific changes for the sequence set forth in SEQ ID NO:2. For example, the "k" at
position 17 of SEQ ID NO:2 can be substituted with a "p" or "h" as indicated in Figure 9.
As also indicated in Figure 9, the "y" at position 125 of SEQ ID NO:2 can be substituted
with an "i" or "f". It will be appreciated that the sequence set forth in SEQ ID NO:2 can
contain any number of variations as well as any combination of types of variations. For
example, the sequence set forth in SEQ ID NO:2 can contain one variation provided in
Figure 9 or more than one (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 50, 100, or more) of
the variations provided in Figure 9. It is noted that the amino acid sequences provided in
Figure 9 can be polypeptides having CoA transferase activity.

The invention also provides polypeptides containing an amino acid sequence that
contains a variant of a portion of the sequence set forth in SEQ ID NO:2 as depicted in
Figure 9 and described herein.

Likewise, Figure 13 provides variations of SEQ ID NO:10 and portions thereof;
Figure 17 provides variations of SEQ ID NO:18 and portions thereof; Figure 21 provides
variations of SEQ ID NO:26 and portions thereof; Figure 33 provides variations of SEQ
ID NO:41 and portions thereof; Figures 40, 41, and 42 provide variations of SEQ ID
NO:39; and Figure 52 provides variations of SEQ ID NO:141 and portions thereof.

Polypeptides having a variant amino acid sequence can retain enzymatic activity.
Such polypeptides can be produced by manipulating the nucleotide sequence encoding a
polypeptide using standard procedures such as site-directed mutagenesis or PCR. One
type of modification includes the substitution of one or more amino acid residues for
amino acid residues having a similar biochemical property. For example, a polypeptide
can have an amino acid sequence set forth in SEQ ID NO:2, 10, 18, 26, 35, 37, 39, 41,
141, 160, or 161 with one or more conservative substitutions.

More substantial changes can be obtained by selecting substitutions that are less
conservative than those in Table 1, i.e., selecting residues that differ more significantly in
their effect on maintaining: (a) the structure of the polypeptide backbone in the area of the
substitution, for example, as a sheet or helical conformation; (b) the charge or
hydrophobicity of the polypeptide at the target site; or (c) the bulk of the side chain. The
substitutions that in general are expected to produce the greatest changes in polypeptide
function are those in which: (a) a hydrophilic residue, e.g., serine or threonine, is
substituted for (or by) a hydrophobic residue, e.g., leucine, isoleucine, phenylalanine,
valine or alanine; (b) a cysteine or proline is substituted for (or by) any other residue; (c)
a residue having an electropositive side chain, e.g., lysine, arginine, or histidine, is
substituted for (or by) an electronegative residue, e.g., glutamic acid or aspartic acid; or
(d) a residue having a bulky side chain, e.g., phenylalanine, is substituted for (or by) one
not having a side chain, e.g., glycine. The effects of these amino acid substitutions (or
other deletions or additions) can be assessed for polypeptides having enzymatic activity
by analyzing the ability of the polypeptide to catalyze the conversion of the same
substrate as the related native polypeptide to the same product as the related native
polypeptide. Accordingly, polypeptides having 5, 10, 20, 30, 40, 50 or less conservative
substitutions are provided by the invention.

Polypeptides and nucleic acid encoding polypeptide can be produced by standard
DNA mutagenesis techniques, for example, M13 primer mutagenesis. Details of these
techniques are provided in Sambrook et al. (ed.), Molecular Cloning: A Laboratory
Manual 2nd ed., vol. 1-3, Cold Spring Harbor Laboratory Press, Cold Spring Harbor,
N.Y., 1989, Ch. 15. Nucleic acid molecules can contain changes of a coding region to fit
the codon usage bias of the particular organism into which the molecule is to be
introduced.

Alternatively, the coding region can be altered by taking advantage of the
degeneracy of the genetic code to alter the coding sequence in such a way that, while the
nucleic acid sequence is substantially altered, it nevertheless encodes a polypeptide
having an amino acid sequence identical or substantially similar to the native amino acid
sequence. For example, the ninth amino acid residue of the sequence set forth in SEQ ID
NO:2 is alanine, which is encoded in the open reading frame by the nucleotide codon
triplet GCT. Because of the degeneracy of the genetic code, three other nucleotide codon
triplets (GCA, GCC, and GCG) also code for alanine. Thus, the nucleic acid sequence
of the open reading frame can be changed at this position to any of these three codons
without affecting the amino acid sequence of the encoded polypeptide or the
characteristics of the polypeptide. Based upon the degeneracy of the genetic code,
nucleic acid variants can be derived from a nucleic acid sequence disclosed herein using
standard DNA mutagenesis techniques as described herein, or by synthesis of nucleic acid
sequences. Thus, this invention also encompasses nucleic acid molecules that encode the
same polypeptide but vary in nucleic acid sequence by virtue of the degeneracy of the
genetic code.
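
The alanine example can be checked against the standard genetic code. The sketch below
builds the standard codon table and lists the other codons that encode the same amino
acid as a given codon; the function name is illustrative.

```python
from itertools import product

# Standard genetic code, listed with bases in T, C, A, G order (third base fastest).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons(codon: str) -> list[str]:
    """Return the other codons that encode the same amino acid as `codon`."""
    codon = codon.upper()
    amino_acid = CODON_TABLE[codon]
    return sorted(
        other
        for other, aa in CODON_TABLE.items()
        if aa == amino_acid and other != codon
    )


print(CODON_TABLE["GCT"], synonymous_codons("GCT"))
```

For GCT this yields GCA, GCC, and GCG, the three alternatives named in the text. A codon
such as ATG (methionine) has no synonymous alternative, so such positions cannot be
varied without changing the encoded residue.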

IV. Methods of Making 3-HP and Other Organic Acids

Each step provided in the pathways depicted in Figures 1-5, 43-44, 54, and 55 can
be performed within a cell (in vivo) or outside a cell (in vitro, e.g., in a container or
column). Additionally, the organic acid products can be generated through a combination
of in vivo synthesis and in vitro synthesis. Moreover, the in vitro synthesis step, or steps,
can be via chemical reaction or enzymatic reaction.

For example, a microorganism provided herein can be used to perform the steps
provided in Figure 1, or an extract containing polypeptides having the indicated
enzymatic activities can be used to perform the steps provided in Figure 1. In addition,
chemical treatments can be used to perform the conversions provided in Figures 1-5,
43-44, 54, and 55. For example, acrylyl-CoA can be converted into acrylate by hydrolysis.
Other chemical treatments include, without limitation, transesterification to convert
acrylate into an acrylate ester.

Carbon sources suitable as starting points for bioconversion include carbohydrates
and synthetic intermediates. Examples of carbohydrates which cells are capable of
metabolizing to pyruvate include sugars such as dextrose, triglycerides, and fatty acids.

Additionally, intermediate chemical products can be starting points. For example,
acetic acid and carbon dioxide can be introduced into a fermentation broth. Acetyl-CoA,
malonyl-CoA, and 3-HP can be sequentially produced using a polypeptide having CoA
synthase activity, a polypeptide having acetyl-CoA carboxylase activity, and a
polypeptide having malonyl-CoA reductase activity. Other useful intermediate chemical
starting points can include propionic acid, acrylic acid, lactic acid, pyruvic acid, and
β-alanine.

A. Expression of Polypeptides

The polypeptides described herein can be produced individually in a host cell or in
combination in a host cell. Moreover, the polypeptides having a particular enzymatic
activity can be a polypeptide that is either naturally-occurring or non-naturally-occurring.
A naturally-occurring polypeptide is any polypeptide having an amino acid sequence as
found in nature, including wild-type and polymorphic polypeptides. Such naturally-
occurring polypeptides can be obtained from any species including, without limitation,
animal (e.g., mammalian), plant, fungal, and bacterial species. A non-naturally-occurring
polypeptide is any polypeptide having an amino acid sequence that is not found in nature.
Thus, a non-naturally-occurring polypeptide can be a mutated version of a naturally-
occurring polypeptide, or an engineered polypeptide. For example, a non-naturally-
occurring polypeptide having 3-hydroxypropionyl-CoA dehydratase activity can be a
mutated version of a naturally-occurring polypeptide having 3-hydroxypropionyl-CoA
dehydratase activity that retains at least some 3-hydroxypropionyl-CoA dehydratase
activity. A polypeptide can be mutated by, for example, sequence additions, deletions,
substitutions, or combinations thereof.

The invention provides genetically modified cells that can be used to perform one
or more of the steps in the metabolic pathways described herein or the genetically
modified cells can be used to produce polypeptides for subsequent use in
vitro. For example, an individual microorganism can contain exogenous nucleic acid
such that each of the polypeptides necessary to perform the steps depicted in Figures 1, 2,
3, 4, 5, 43, 44, 54, or 55 are expressed. It is important to note that such cells can contain
any number of exogenous nucleic acid molecules. For example, a particular cell can
contain six exogenous nucleic acid molecules with each one encoding one of the six
polypeptides necessary to convert lactate into 3-HP as depicted in Figure 1, or a particular
cell can endogenously produce polypeptides necessary to convert lactate into acrylyl-CoA
while containing exogenous nucleic acid that encodes polypeptides necessary to convert
acrylyl-CoA into 3-HP.

In addition, a single exogenous nucleic acid molecule can encode one or more
than one polypeptide. For example, a single exogenous nucleic acid molecule can contain
sequences that encode three different polypeptides. Further, the cells described herein
can contain a single copy, or multiple copies (e.g., about 5, 10, 20, 35, 50, 75, 100 or 150
copies), of a particular exogenous nucleic acid molecule. For example, a particular cell
can contain about 50 copies of the constructs depicted in Figure 34, 35, 36, 37, 38, or 45.
Again, the cells described herein can contain more than one particular exogenous nucleic
acid molecule. For example, a particular cell can contain about 150 copies of exogenous
nucleic acid molecule X as well as about 75 copies of exogenous nucleic acid molecule
Y.

In another embodiment, a cell within the scope of the invention can contain an
exogenous nucleic acid molecule that encodes a polypeptide having 3-hydroxypropionyl-
CoA dehydratase activity. Such cells can have any level of 3-hydroxypropionyl-CoA
dehydratase activity. For example, a cell containing an exogenous nucleic acid molecule
that encodes a polypeptide having 3-hydroxypropionyl-CoA dehydratase activity can
have 3-hydroxypropionyl-CoA dehydratase activity with a specific activity greater than
about 1 mg 3-HP-CoA formed per gram dry cell weight per hour (e.g., greater than about
10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 125, 150, 200, 250, 300, 350, 400, 500, or more
mg 3-HP-CoA formed per gram dry cell weight per hour). Alternatively, a cell can have
3-hydroxypropionyl-CoA dehydratase activity such that a cell extract from 1x10^6 cells
has a specific activity greater than about 1 μg 3-HP-CoA formed per mg total protein per
10 minutes (e.g., greater than about 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 125, 150, 200,
250, 300, 350, 400, 500, or more μg 3-HP-CoA formed per mg total protein per 10
minutes).
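
The two specific activity measures above differ only in their normalization. The sketch
below shows the arithmetic for each; the function names and the example measurements are
hypothetical and do not come from the specification.

```python
def activity_per_dry_cell_weight(product_mg: float, dry_cell_weight_g: float,
                                 hours: float) -> float:
    """mg product formed per gram dry cell weight per hour."""
    if dry_cell_weight_g <= 0 or hours <= 0:
        raise ValueError("dry cell weight and time must be positive")
    return product_mg / (dry_cell_weight_g * hours)


def activity_per_protein(product_ug: float, total_protein_mg: float,
                         minutes: float) -> float:
    """ug product formed per mg total protein per 10 minutes."""
    if total_protein_mg <= 0 or minutes <= 0:
        raise ValueError("protein amount and time must be positive")
    ten_minute_intervals = minutes / 10.0
    return product_ug / (total_protein_mg * ten_minute_intervals)


# Hypothetical measurements: 30 mg 3-HP-CoA from 0.5 g dry cells in 2 hours,
# and 45 ug 3-HP-CoA from 1.5 mg extract protein in 30 minutes.
print(activity_per_dry_cell_weight(30.0, 0.5, 2.0))
print(activity_per_protein(45.0, 1.5, 30.0))
```

Either value can then be compared against the "greater than about" levels recited in the
text.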

A nucleic acid molecule encoding a polypeptide having enzymatic activity can be
identified and obtained using any method such as those described herein. For example,
nucleic acid molecules that encode a polypeptide having enzymatic activity can be
identified and obtained using common molecular cloning or chemical nucleic acid
synthesis procedures and techniques, including PCR. In addition, standard nucleic acid
sequencing techniques and software programs that translate nucleic acid sequences into
amino acid sequences based on the genetic code can be used to determine whether or not
a particular nucleic acid has any sequence homology with known enzymatic polypeptides.
Sequence alignment software such as MEGALIGN® (DNASTAR; Madison, WI, 1997)
can be used to compare various sequences. In addition, nucleic acid molecules encoding
known enzymatic polypeptides can be mutated using common molecular cloning
techniques (e.g., site-directed mutagenesis). Possible mutations include, without
limitation, deletions, insertions, and base substitutions, as well as combinations of
deletions, insertions, and base substitutions. Further, nucleic acid and amino acid
databases (e.g., GenBank®) can be used to identify a nucleic acid sequence that encodes a
polypeptide having enzymatic activity. Briefly, any amino acid sequence having some
homology to a polypeptide having enzymatic activity, or any nucleic acid sequence
having some homology to a sequence encoding a polypeptide having enzymatic activity
can be used as a query to search GenBank®. The identified polypeptides then can be
analyzed to determine whether or not they exhibit enzymatic activity.
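
Translating a nucleic acid sequence into an amino acid sequence based on the genetic
code, as mentioned above, can be sketched in a few lines. This is a minimal illustration
that assumes the standard genetic code and a sequence already in the correct reading
frame; it is not the software referred to in the text, and the input is a hypothetical
fragment.

```python
from itertools import product

# Standard genetic code, listed with bases in T, C, A, G order (third base fastest).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_to_stop(nucleic_acid: str) -> str:
    """Translate an in-frame DNA or RNA sequence, stopping at the first stop codon."""
    seq = nucleic_acid.upper().replace("U", "T")  # accept RNA input as well
    residues = []
    # Step through complete codons only; a trailing partial codon is ignored.
    for i in range(0, len(seq) - 2, 3):
        amino_acid = CODON_TABLE[seq[i:i + 3]]
        if amino_acid == "*":
            break
        residues.append(amino_acid)
    return "".join(residues)


print(translate_to_stop("atggctaaaccgtaa"))
```

The resulting amino acid sequence is what would then be compared, by alignment or
database search, with known enzymatic polypeptides.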

In addition, nucleic acid hybridization techniques can be used to identify and
obtain a nucleic acid molecule that encodes a polypeptide having enzymatic activity.
Briefly, any nucleic acid molecule that encodes a known enzymatic polypeptide, or
fragment thereof, can be used as a probe to identify similar nucleic acid molecules by
hybridization under conditions of moderate to high stringency. Such similar nucleic acid
molecules then can be isolated, sequenced, and analyzed to determine whether the
encoded polypeptide has enzymatic activity.

Expression cloning techniques also can be used to identify and obtain a nucleic
acid molecule that encodes a polypeptide having enzymatic activity. For example, a
substrate known to interact with a particular enzymatic polypeptide can be used to screen
a phage display library containing that enzymatic polypeptide. Phage display libraries
can be generated as described elsewhere (Burritt et al., Anal. Biochem. 238:1-13 (1990)),
or can be obtained from commercial suppliers such as Novagen (Madison, WI).

Further, polypeptide sequencing techniques can be used to identify and obtain a
nucleic acid molecule that encodes a polypeptide having enzymatic activity. For
example, a purified polypeptide can be separated by gel electrophoresis, and its amino
acid sequence determined by, for example, amino acid microsequencing techniques.
Once determined, the amino acid sequence can be used to design degenerate
oligonucleotide primers. Degenerate oligonucleotide primers can be used to obtain the
nucleic acid encoding the polypeptide by PCR. Once obtained, the nucleic acid can be
sequenced, cloned into an appropriate expression vector, and introduced into a
microorganism.
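
As a rough illustration of the degenerate-primer step described above, the following minimal sketch counts how many distinct sequences a degenerate primer represents under the standard IUPAC nucleotide codes. The function name and the example primer (a hypothetical oligonucleotide for the peptide Met-Asp-Trp-Lys-Pro) are assumptions made for illustration and are not taken from the disclosure.

```python
# Number of bases represented by each IUPAC nucleotide code.
IUPAC_DEGENERACY = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}


def primer_degeneracy(primer: str) -> int:
    """Return the number of distinct sequences encoded by a degenerate primer."""
    total = 1
    for base in primer.upper():
        if base not in IUPAC_DEGENERACY:
            raise ValueError(f"not an IUPAC nucleotide code: {base!r}")
        total *= IUPAC_DEGENERACY[base]
    return total


# Hypothetical primer for Met-Asp-Trp-Lys-Pro: ATG GAY TGG AAR CCN.
example_primer = "ATGGAYTGGAARCCN"
print(primer_degeneracy(example_primer))
```

A higher count means each individual sequence is present at a lower effective concentration, which is one reason primer amounts may be raised for highly degenerate primers.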

Any method can be used to introduce an exogenous nucleic acid molecule into a
cell. In fact, many methods for introducing nucleic acid into microorganisms such as
bacteria and yeast are well known to those skilled in the art. For example, heat shock,
lipofection, electroporation, conjugation, fusion of protoplasts, and biolistic delivery are
common methods for introducing nucleic acid into bacteria and yeast cells. See, e.g., Ito
et al., J. Bacteriol. 153:163-168 (1983); Durrens et al., Curr. Genet. 18:7-12 (1990); and
Becker and Guarente, Methods in Enzymology 194:182-187 (1991).

An exogenous nucleic acid molecule contained within a particular cell of the
invention can be maintained within that cell in any form. For example, exogenous
nucleic acid molecules can be integrated into the genome of the cell or maintained in an
episomal state. In other words, a cell of the invention can be a stable or transient
transformant. Again, a microorganism described herein can contain a single copy, or
multiple copies (e.g., about 5, 10, 20, 35, 50, 75, 100 or 150 copies), of a particular
exogenous nucleic acid molecule as described herein.

Methods for expressing an amino acid sequence from an exogenous nucleic acid
molecule are well known to those skilled in the art. Such methods include, without
limitation, constructing a nucleic acid such that a regulatory element promotes the
expression of a nucleic acid sequence that encodes a polypeptide. Typically, regulatory
elements are DNA sequences that regulate the expression of other DNA sequences at the
level of transcription. Thus, regulatory elements include, without limitation, promoters,
enhancers, and the like. Any type of promoter can be used to express an amino acid
sequence from an exogenous nucleic acid molecule. Examples of promoters include,
without limitation, constitutive promoters, tissue-specific promoters, and promoters
responsive or unresponsive to a particular stimulus (e.g., chemical
concentration, and the like). Moreover, methods for expressing a polypeptide from an
exogenous nucleic acid molecule in cells such as bacterial cells and yeast cells are well
known to those skilled in the art. For example, nucleic acid constructs that are capable of
expressing exogenous polypeptides within E. coli are well known. See, e.g., Sambrook et
al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, New
York, USA, second edition (1989).

B. Production of Organic Acids and Related Products via Host Cells 

The nucleic acid and amino acid sequences provided herein can be used with cells
to produce 3-HP and/or other organic compounds such as 1,3-propanediol, acrylic acid,
polymerized acrylate, esters of acrylate, esters of 3-HP, and polymerized 3-HP. Such
cells can be from any species including those listed within the taxonomy web pages at the
National Institute of Health sponsored by the United States government
(www.ncbi.nlm.nih.gov). The cells can be eukaryotic or prokaryotic. For example,
genetically modified cells of the invention can be mammalian cells (e.g., human, murine,
and bovine cells), plant cells (e.g., corn, wheat, rice, and soybean cells), fungal cells (e.g.,
Aspergillus and Rhizopus cells), yeast cells, or bacterial cells (e.g., Lactobacillus,
Lactococcus, Bacillus, Escherichia, and Clostridium cells). A cell of the invention also
can be a microorganism. The term "microorganism" as used herein refers to any
microscopic organism including, without limitation, bacteria, algae, fungi, and protozoa.
Thus, E. coli, S. cerevisiae, Kluyveromyces lactis, Candida blankii, Candida rugosa, and
Pichia pastoris are considered microorganisms and can be used as described herein.

Typically, a cell of the invention is genetically modified such that a particular
organic compound is produced. In one embodiment, the invention provides cells that
make 3-HP from PEP. Examples of biosynthetic pathways that can be used by cells to make
3-HP are shown in Figures 1-5, 43-44, 54, and 55.

Generally, cells that are genetically modified to synthesize a particular organic
compound contain one or more exogenous nucleic acid molecules that encode
polypeptides having specific enzymatic activities. For example, a microorganism can
contain exogenous nucleic acid that encodes a polypeptide having 3-hydroxypropionyl-CoA
dehydratase activity. In this case, acrylyl-CoA can be converted into
3-hydroxypropionic acid-CoA, which can lead to the production of 3-HP. It is noted that a
cell can be given an exogenous nucleic acid molecule that encodes a polypeptide having
an enzymatic activity that catalyzes the production of a compound not normally produced
by that cell. Alternatively, a cell can be given an exogenous nucleic acid molecule that
encodes a polypeptide having an enzymatic activity that catalyzes the production of a
compound that is normally produced by that cell. In this case, the genetically modified
cell can produce more of the compound, or can produce the compound more efficiently,
than a similar cell not having the genetic modification.

In one embodiment, the invention provides a cell containing an exogenous nucleic
acid molecule that encodes a polypeptide having enzymatic activity that leads to the
formation of 3-HP. It is noted that the produced 3-HP can be secreted from the cell,
eliminating the need to disrupt cell membranes to retrieve the organic compound.
Typically, the cell of the invention produces 3-HP with the concentration being at least
about 100 mg per L (e.g., at least about 1 g/L, 5 g/L, 10 g/L, 25 g/L, 50 g/L, 75 g/L, 80
g/L, 90 g/L, 100 g/L, or 120 g/L). When determining the yield of an organic compound
such as 3-HP for a particular cell, any method can be used. See, e.g., Applied and
Environmental Microbiology 59(12):4261-4265 (1993). Typically, a cell within the scope
of the invention such as a microorganism catabolizes a hexose carbon source such as
glucose. A cell, however, can catabolize a variety of carbon sources such as pentose
sugars (e.g., ribose, arabinose, xylose, and lyxose), fatty acids, acetate, or glycerols. In
other words, a cell within the scope of the invention can utilize a variety of carbon
sources.

As described herein, a cell within the scope of the invention can contain an
exogenous nucleic acid molecule that encodes a polypeptide having enzymatic activity
that leads to the formation of 3-HP or other organic compounds such as 1,3-propanediol,
acrylic acid, poly-acrylate, acrylate-esters, 3-HP-esters, and poly-3-HP. Methods of
identifying cells that contain exogenous nucleic acid are well known to those skilled in
the art. Such methods include, without limitation, PCR and nucleic acid hybridization
techniques such as Northern and Southern analysis (see hybridization described herein).
In some cases, immunohistochemistry and biochemical techniques can be used to
determine if a cell contains a particular nucleic acid by detecting the expression of the
polypeptide encoded by that particular nucleic acid. For example, an antibody
having specificity for a polypeptide can be used to determine whether or not a particular
cell contains nucleic acid encoding that polypeptide. Further, biochemical techniques can
be used to determine if a cell contains a particular nucleic acid molecule encoding a
polypeptide having enzymatic activity by detecting an organic product produced as a
result of the expression of the polypeptide having enzymatic activity. For example,
detection of 3-HP after introduction of exogenous nucleic acid that encodes a polypeptide
having 3-hydroxypropionyl-CoA dehydratase activity into a cell that does not normally
express such a polypeptide can indicate that that cell not only contains the introduced
exogenous nucleic acid molecule but also expresses the encoded polypeptide from that
introduced exogenous nucleic acid molecule. Methods for detecting specific enzymatic
activities or the presence of particular organic products are well known to those skilled in
the art. For example, the presence of an organic compound such as 3-HP can be
determined as described elsewhere. See, Sullivan and Clarke, J. Assoc. Offic. Agr.
Chemists, 38:514-518 (1955).

C. Cells with Reduced Polypeptide Activity

The invention also provides genetically modified cells having reduced polypeptide
activity. The term "reduced" as used herein with respect to a cell and a particular
polypeptide's activity refers to a lower level of activity than that measured in a
comparable cell of the same species. For example, a particular microorganism lacking
enzymatic activity X is considered to have reduced enzymatic activity X if a comparable
microorganism has at least some enzymatic activity X. It is noted that a cell can have the
activity of any type of polypeptide reduced including, without limitation, enzymes,
transcription factors, transporters, receptors, signal molecules, and the like. For example,
a cell can contain an exogenous nucleic acid molecule that disrupts a regulatory and/or
coding sequence of a polypeptide having pyruvate decarboxylase activity or alcohol
dehydrogenase activity. Disrupting pyruvate decarboxylase and/or alcohol
dehydrogenase expression can lead to the accumulation of lactate as well as products
produced from lactate such as 3-HP, 1,3-propanediol, acrylic acid, poly-acrylate,
acrylate-esters, 3-HP-esters, and poly-3-HP. It is also noted that reduced polypeptide
activities can be the result of lower polypeptide concentration, lower specific activity of a
polypeptide, or combinations thereof. Many different methods can be used to make a cell
having reduced polypeptide activity. For example, a cell can be engineered to have a
disrupted regulatory sequence or polypeptide-encoding sequence using common
mutagenesis or knock-out technology. See, e.g., Methods in Yeast Genetics (1997
edition), Adams, Gottschling, Kaiser, and Stearns, Cold Spring Harbor Press (1998).
Alternatively, antisense technology can be used to reduce the activity of a particular
polypeptide. For example, a cell can be engineered to contain a cDNA that encodes an
antisense molecule that prevents a polypeptide from being translated. The term
"antisense molecule" as used herein encompasses any nucleic acid molecule or nucleic
acid analog (e.g., peptide nucleic acids) that contains a sequence that corresponds to the
coding strand of an endogenous polypeptide. An antisense molecule also can have
flanking sequences (e.g., regulatory sequences). Thus, antisense molecules can be
ribozymes or antisense oligonucleotides. A ribozyme can have any general structure
including, without limitation, hairpin, hammerhead, or axhead structures, provided the
molecule cleaves RNA. Further, gene silencing can be used to reduce the activity of a
particular polypeptide.

A cell having reduced activity of a polypeptide can be identified using any 
method. For example, enzyme activity assays such as those described herein can be used 
to identify cells having a reduced enzyme activity. 

A polypeptide having (1) the amino acid sequence set forth in SEQ ID NO:39 (the
OS17 polypeptide) or (2) an amino acid sequence sharing at least about 1>0 percent
sequence identity with the amino acid sequence set forth in SEQ ID NO:39 can have three
functional domains: a domain having CoA-synthetase activity, a domain having 3-HP-CoA
dehydratase activity, and a domain having CoA-reductase activity. Such
polypeptides can be selectively modified by mutating and/or deleting domains such that
one or two of the enzymatic activities are reduced. Reducing the dehydratase activity of
the OS17 polypeptide can cause acrylyl-CoA to be created from propionyl-CoA. The
acrylyl-CoA then can be contacted with a polypeptide having CoA hydrolase activity to
produce acrylate from propionate (Figure 43). Similarly, acrylyl-CoA can be created
from 3-HP by using, for example, an OS17 polypeptide having reduced reductase
activity.

D. Production of Organic Acids and Related Products via In Vitro
Techniques

In addition, purified polypeptides having enzymatic activity can be used alone or
in combination with cells to produce 3-HP or other organic compounds such as
1,3-propanediol, acrylic acid, polymerized acrylate, esters of acrylate, esters of 3-HP, and
polymerized 3-HP. For example, a preparation containing a substantially pure
polypeptide having 3-hydroxypropionyl-CoA dehydratase activity can be used to catalyze
the formation of 3-HP-CoA, a precursor to 3-HP. Further, cell-free extracts containing a
polypeptide having enzymatic activity can be used alone or in combination with purified
polypeptides and/or cells to produce 3-HP. For example, a cell-free extract containing a
polypeptide having CoA transferase activity can be used to form lactyl-CoA, while a
microorganism containing polypeptides having the enzymatic activities necessary to
catalyze the reactions needed to form 3-HP from lactyl-CoA can be used to produce
3-HP. Any method can be used to produce a cell-free extract. For example, osmotic shock,
sonication, and/or a repeated freeze-thaw cycle followed by filtration and/or
centrifugation can be used to produce a cell-free extract from intact cells.

It is noted that a cell, purified polypeptide, and/or cell-free extract can be used to
produce 3-HP that is, in turn, treated chemically to produce another compound. For
example, a microorganism can be used to produce 3-HP, while a chemical process is used
to modify 3-HP into a derivative such as polymerized 3-HP or an ester of 3-HP.
Likewise, a chemical process can be used to produce a particular compound that is, in
turn, converted into 3-HP or other organic compound (e.g., 1,3-propanediol, acrylic acid,
polymerized acrylate, esters of acrylate, esters of 3-HP, and polymerized 3-HP) using a
cell, substantially pure polypeptide, and/or cell-free extract described herein. For
example, a chemical process can be used to produce acrylyl-CoA, while a microorganism
can be used to convert acrylyl-CoA into 3-HP.

E. Fermentation of Cells to Produce Organic Acids 

Typically, 3-HP is produced by providing a production cell, such as a
microorganism, and culturing the microorganism with culture medium such that 3-HP is
produced. In general, the culture media and/or culture conditions can be such that the
microorganisms grow to an adequate density and produce 3-HP efficiently. For large-scale
production processes, any method can be used such as those described elsewhere
(Manual of Industrial Microbiology and Biotechnology, 2nd Edition, Editors: A. L.
Demain and J. E. Davies, ASM Press; and Principles of Fermentation Technology, P. F.
Stanbury and A. Whitaker, Pergamon). Briefly, a large tank (e.g., a 100 gallon, 200
gallon, 500 gallon, or more tank) containing appropriate culture medium with, for
example, a glucose carbon source is inoculated with a particular microorganism. After
inoculation, the microorganisms are incubated to allow biomass to be produced. Once a
desired biomass is reached, the broth containing the microorganisms can be transferred to
a second tank. This second tank can be any size. For example, the second tank can be
larger, smaller, or the same size as the first tank. Typically, the second tank is larger than
the first such that additional culture medium can be added to the broth from the first tank.
In addition, the culture medium within this second tank can be the same as, or different
from, that used in the first tank. For example, the first tank can contain medium with
xylose, while the second tank contains medium with glucose.

Once transferred, the microorganisms can be incubated to allow for the production
of 3-HP. Once produced, any method can be used to isolate the 3-HP. For example,
common separation techniques can be used to remove the biomass from the broth, and
common isolation procedures (e.g., extraction, distillation, and ion-exchange procedures)
can be used to obtain the 3-HP from the microorganism-free broth. In addition, 3-HP can
be isolated while it is being produced, or it can be isolated from the broth after the
product production phase has been terminated.

F. Products Created From the Disclosed Biosynthetic Routes

The organic compounds produced from any of the steps provided in Figures 1-5,
43-44, 54, and 55 can be chemically converted into other organic compounds. For
example, 3-HP can be hydrogenated to form 1,3-propanediol, a valuable polyester
monomer. Hydrogenating an organic acid such as 3-HP can be performed using any
method such as those used to hydrogenate succinic acid and/or lactic acid. For example,
3-HP can be hydrogenated using a metal catalyst. In another example, 3-HP can be
dehydrated to form acrylic acid. Any method can be used to perform a dehydration
reaction. For example, 3-HP can be heated in the presence of a catalyst (e.g., a metal or
mineral acid catalyst) to form acrylic acid. Propanediol also can be created using
polypeptides having oxidoreductase activity (e.g., enzymes in the 1.1.1.- class of
enzymes) in vitro or in vivo.

V. Overview of Methodology Used to Create Biosynthetic Pathways 
That Make 3-HP from PEP 

The invention provides methods of making 3-HP and related products from PEP
via the use of biosynthetic pathways. Illustrative examples include methods involving the
production of 3-HP via a lactate intermediate, a malonyl-CoA intermediate, and a
β-alanine intermediate.

A. Biosynthetic Pathway for Making 3-HP through a Lactic Acid
Intermediate

A biosynthetic pathway that allows for the production of 3-HP from PEP was
constructed (Figure 1). This pathway involved using several polypeptides that were
cloned and expressed as described herein. M. elsdenii cells (ATCC 17753) were used as
a source of genomic DNA. Primers were used to identify and clone a nucleic acid
sequence encoding a polypeptide having CoA transferase activity (SEQ ID NO: 1). The
polypeptide was subsequently tested for enzymatic activity and found to have CoA
transferase activity.

Similarly, PCR primers were used to identify nucleic acid sequences from M.
elsdenii genomic DNA that encoded E1 activator, E2 α, and E2 β polypeptides (SEQ
ID NOs: 9, 17, and 25, respectively). These polypeptides were subsequently shown to
have lactyl-CoA dehydratase activity.

Chloroflexus aurantiacus cells (ATCC 29365) were used as a source of genomic
DNA. Initial cloning led to the identification of nucleic acid sequences: OS17 (SEQ ID
NO:129) and OS19 (SEQ ID NO: [illegible]). Subsequent assays revealed that OS17 encodes
a polypeptide having CoA synthase activity, dehydratase activity, and dehydrogenase
activity (propionyl-CoA synthetase). Subsequent assays also revealed that OS19
encodes a polypeptide having 3-hydroxypropionyl-CoA dehydratase activity (also
referred to as acrylyl-CoA hydratase activity).

Several operons were constructed for use in E. coli. These operons allow for the
production of 3-HP in bacterial cells. Additional experiments allowed for the expression
of these polypeptides in yeast, which can be used to produce 3-HP.

B. Biosynthetic Pathway for Making 3-HP through a Malonyl-CoA
Intermediate

Another pathway leading to the production of 3-HP from PEP was constructed.
This pathway used a polypeptide having acetyl CoA carboxylase activity that was isolated
from E. coli (Example 9), and a polypeptide having malonyl-CoA reductase activity that
was isolated from Chloroflexus aurantiacus (Example 10). The combination of these two
polypeptides allows for the production of 3-HP from acetyl-CoA (Figure 44).

Nucleic acid encoding a polypeptide having malonyl-CoA reductase activity (SEQ
ID NO: 140) was cloned, sequenced, and expressed. The polypeptide having malonyl-CoA
reductase activity was then used to make 3-HP.

C. Biosynthetic Pathways For Making 3-HP through a β-alanine
Intermediate

In general, prokaryotes and eukaryotes metabolize glucose via the
Embden-Meyerhof-Parnas pathway to PEP, a central metabolite in carbon metabolism. The PEP
generated from glucose is either carboxylated to oxaloacetate or is converted to pyruvate.
Carboxylation of PEP to oxaloacetate can be catalyzed by a polypeptide having PEP
carboxylase activity, a polypeptide having PEP carboxykinase activity, or a polypeptide
having PEP transcarboxylase activity. Pyruvate that is generated from PEP by a
polypeptide having pyruvate kinase activity can also be converted to oxaloacetate by a
polypeptide having pyruvate carboxylase activity.

Oxaloacetate generated either from PEP or pyruvate can act as a precursor for
production of aspartic acid. This conversion can be carried out by a polypeptide having
aspartate aminotransferase activity, which transfers an amino group from glutamate to
oxaloacetate. Glutamate consumed in this reaction can be regenerated by the action of a
polypeptide having glutamate dehydrogenase activity or by the action of a polypeptide
having 4,4-aminobutyrate aminotransferase activity. The decarboxylation of aspartate to
β-alanine is catalyzed by a polypeptide having aspartate decarboxylase activity. β-alanine
produced through this biochemistry can be converted to 3-HP via two possible pathways.
These pathways are provided in Figures 54 and 55.

The steps involved in the production of β-alanine can be the same for both
pathways. These steps can be accomplished by endogenous polypeptides in the host cells
which convert PEP to β-alanine, or these steps can be accomplished with recombinant
DNA technology using known polypeptides such as polypeptides having
PEP-carboxykinase activity (4.1.1.32), aspartate aminotransferase activity (2.6.1.1), and
aspartate alpha-decarboxylase activity (4.1.1.11).

As depicted in Figure 54, a polypeptide having CoA transferase activity (e.g., a
polypeptide having a sequence set forth in SEQ ID NO:2) can be used to convert
β-alanine to β-alanyl-CoA. β-alanyl-CoA can be converted to acrylyl-CoA via a
polypeptide having β-alanyl-CoA ammonia lyase activity (e.g., a polypeptide having a
sequence set forth in SEQ ID NO: 160). Acrylyl-CoA can be converted to 3-HP-CoA
using a polypeptide having 3-HP-CoA dehydratase activity (e.g., a polypeptide having a
sequence set forth in SEQ ID NO:40). 3-HP-CoA can be converted into 3-HP via a
polypeptide having CoA transferase activity (e.g., a polypeptide having a sequence set
forth in SEQ ID NO:2).
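
The Figure 54 route described above can be written down as an ordered list of conversion steps, which makes it easy to check that each product feeds the next step and to list the distinct activities a host cell would need. The following is only an illustrative sketch: the `Step` type and the helper function names are hypothetical, while the substrates, activities, and SEQ ID NOs are those named in the preceding paragraph.

```python
from typing import NamedTuple


class Step(NamedTuple):
    substrate: str
    product: str
    activity: str
    seq_id_no: int  # example polypeptide cited in the text for this activity


FIGURE_54_PATHWAY = [
    Step("beta-alanine", "beta-alanyl-CoA", "CoA transferase", 2),
    Step("beta-alanyl-CoA", "acrylyl-CoA", "beta-alanyl-CoA ammonia lyase", 160),
    Step("acrylyl-CoA", "3-HP-CoA", "3-HP-CoA dehydratase", 40),
    Step("3-HP-CoA", "3-HP", "CoA transferase", 2),
]


def is_connected(pathway):
    """True when each step's product is the substrate of the following step."""
    return all(a.product == b.substrate for a, b in zip(pathway, pathway[1:]))


def required_activities(pathway):
    """Distinct enzymatic activities, in order of first use."""
    seen = []
    for step in pathway:
        if step.activity not in seen:
            seen.append(step.activity)
    return seen


print(is_connected(FIGURE_54_PATHWAY))
print(required_activities(FIGURE_54_PATHWAY))
```

Because the CoA transferase activity is used at both the first and last steps, the four conversions call for only three distinct activities.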

As depicted in Figure 55, a polypeptide having 4,4-aminobutyrate
aminotransferase activity (2.6.1.19) can be used to convert β-alanine into malonate
semialdehyde. The malonate semialdehyde can be converted into 3-HP using either a
polypeptide having 3-hydroxypropionate dehydrogenase activity (1.1.1.59) or a
polypeptide having 3-hydroxyisobutyrate dehydrogenase activity.

EXAMPLES 
Example 1 - Cloning nucleic acid molecules that
encode a polypeptide having CoA transferase activity

Genomic DNA was isolated from Megasphaera elsdenii cells (ATCC 17753)
grown in 1053 Reinforced Clostridium media under anaerobic conditions at 37°C in roll
tubes for 12-14 hours. Once grown, the cells were pelleted, washed with 5 mL of a 10
mM Tris solution, and repelleted. The pellet was resuspended in 1 mL of Gentra Cell
Suspension Solution to which 14.2 mg of lysozyme and 4 µL of 20 mg/mL proteinase K
solution were added. The cell suspension was incubated at 37°C for 30 minutes. The
genomic DNA was then isolated using a Gentra Genomic DNA Isolation Kit following
the provided protocol. The precipitated genomic DNA was spooled and air-dried for 10
minutes. The genomic DNA was suspended in 500 µL of a 10 mM Tris solution and
stored at 4°C.

Two degenerate forward (CoAF1 and CoAF2) and three degenerate reverse
(CoAR1, CoAR2, and CoAR3) PCR primers were designed based on conserved
acetoacetyl CoA transferase and propionate CoA transferase sequences (CoAF1
5'-GAAWSCGGYSCNATYGGYGG-3', SEQ ID NO: 49; CoAF2
5'-TTYTGYGGYRSBTTYACBGCWGG-3', SEQ ID NO: 50; CoAR1
5'-CCWGCVGTRAAVSYRCCRCARAA-3', SEQ ID NO: 51; CoAR2
5'-AARACDSMRCGTTCVGTRATRTA-3', SEQ ID NO: 52; and CoAR3
5'-TCRAYRCCSGGWGCRAYTTC-3', SEQ ID NO: 53). The primers were used in all logical
combinations in PCR using Taq polymerase (Roche Molecular Biochemicals,
Indianapolis, IN) and 1 ng of genomic DNA per µL reaction mix. PCR was conducted
using a touchdown PCR program with 4 cycles
at an annealing temperature of 59°C, 4 cycles at 57°C, 4 cycles at 55°C, and 18 cycles at
52°C. Each cycle used an initial 30-second denaturing step at 94°C and a 3 minute
extension at 72°C. The program had an initial denaturing step for 2 minutes at 94°C and
a final extension step of 4 minutes at 72°C. Time allowed for annealing was 45 seconds.
The amounts of PCR primer used in the reactions were increased 2-8 fold above typical
PCR amounts depending on the amount of degeneracy in the 3' end of the primer. In
addition, separate PCR reactions containing each individual primer were made to identify
PCR products resulting from single degenerate primers. Each PCR product (25 µL) was
separated by electrophoresis using a 1% TAE (Tris-acetate-EDTA) agarose gel.
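
The touchdown program described above can be laid out as a per-cycle annealing schedule. The sketch below is illustrative only: the constant and function names are hypothetical, the temperatures, cycle counts, and step durations are those stated in the text, and the time estimate ignores ramping between temperatures, which the text does not specify.

```python
# Each stage is (annealing temperature in deg C, number of cycles), as stated in the text.
TOUCHDOWN_STAGES = [(59, 4), (57, 4), (55, 4), (52, 18)]

DENATURE_S = 30            # 94 deg C, each cycle
ANNEAL_S = 45              # at the stage's annealing temperature
EXTEND_S = 3 * 60          # 72 deg C, each cycle
INITIAL_DENATURE_S = 2 * 60
FINAL_EXTENSION_S = 4 * 60


def annealing_schedule(stages):
    """Expand the stages into one annealing temperature per cycle."""
    return [temp for temp, cycles in stages for _ in range(cycles)]


def hold_time_seconds(stages):
    """Total programmed hold time, ignoring ramping between temperatures."""
    cycles = sum(n for _, n in stages)
    per_cycle = DENATURE_S + ANNEAL_S + EXTEND_S
    return INITIAL_DENATURE_S + cycles * per_cycle + FINAL_EXTENSION_S


schedule = annealing_schedule(TOUCHDOWN_STAGES)
print(len(schedule), "cycles")
print(hold_time_seconds(TOUCHDOWN_STAGES) / 60, "minutes of programmed holds")
```

Starting at the highest annealing temperature favors specific priming in the early cycles, and the later, more permissive cycles then amplify mainly the products already formed.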

The CoAF1-CoAR2, CoAF1-CoAR3, CoAF2-CoAR2, and CoAF2-CoAR3
combinations produced a band of 423, 474, 177, and 228 bp, respectively. These bands
matched the sizes based on other CoA transferase sequences. No band was visible from
the individual primer control reactions. The CoAF1-CoAR3 fragment (474 bp) was
isolated and purified using a Qiagen Gel Extraction Kit (Qiagen Inc., Valencia, CA).
Four µL of the purified band was ligated into pCRII vector and transformed into TOP10
E. coli cells by heat-shock using a TOPO cloning procedure (Invitrogen, Carlsbad, CA).
Transformations were plated on LB media containing 100 µg/mL of ampicillin (Amp)
and 50 µg/mL of 5-Bromo-4-Chloro-3-Indolyl-β-D-Galactopyranoside (X-gal). Single,
white colonies were plated onto fresh media and screened in a PCR reaction using the
CoAF1 and CoAR3 primers to confirm the presence of the insert.

Plasmid DNA obtained using a QiaPrep Spin Miniprep Kit (Qiagen, Inc.) was
quantified and used for DNA sequencing with M13R and M13F primers. Sequence
analysis revealed that the CoAF1-CoAR3 fragment shared sequence similarity with
acetoacetyl CoA transferase sequences.
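
The four band sizes reported for the degenerate primer pairs can be checked for internal consistency: if each primer anneals at a single site, the size difference between the two forward primers should be the same whichever reverse primer is used, and likewise for the two reverse primers. The short sketch below performs that arithmetic check. It is an illustration added here, not part of the described procedure, and the function name is hypothetical.

```python
# Band sizes (bp) reported for each forward/reverse primer combination.
BAND_SIZES = {
    ("CoAF1", "CoAR2"): 423,
    ("CoAF1", "CoAR3"): 474,
    ("CoAF2", "CoAR2"): 177,
    ("CoAF2", "CoAR3"): 228,
}


def sizes_are_additive(sizes):
    """Check that primer-site offsets agree across all four combinations."""
    forward_offsets = {
        sizes[("CoAF1", rev)] - sizes[("CoAF2", rev)] for rev in ("CoAR2", "CoAR3")
    }
    reverse_offsets = {
        sizes[(fwd, "CoAR3")] - sizes[(fwd, "CoAR2")] for fwd in ("CoAF1", "CoAF2")
    }
    # A single value in each set means the offsets agree.
    return len(forward_offsets) == 1 and len(reverse_offsets) == 1


print(sizes_are_additive(BAND_SIZES))
```

With the reported values, the two forward sites are 246 bp apart and the two reverse sites are 51 bp apart in both comparisons, as expected for products from a single locus.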

Genome walking was performed to obtain the complete coding sequence. The
following primers for genome walking in both upstream and downstream directions were
designed using the portion of the 474 bp CoAF1-CoAR3 fragment sequence that was
internal to the degenerate primers (COAGSP1F
5'-GAATGTTTACTTCTGCGGCACCTTCAC-3', SEQ ID NO:54; COAGSP2F
5'-GACCAGATCACTTTCAACGGTTCCTATG-3', SEQ ID NO:55; COAGSP1R
5'-GCATAGGAACCGTTGAAAGTGATCTGG-3', SEQ ID NO:56; and COAGSP2R
5'-GTTAGTACCGAACTTGCTGACGTTGATG-3', SEQ ID NO:57). The COAGSP1F and COAGSP2F
primers face downstream, while the COAGSP1R and COAGSP2R primers face upstream.
In addition, the COAGSP2F and COAGSP2R primers are nested inside the COAGSP1F and
COAGSP1R primers. Genome walking was performed using the Universal Genome
Walking kit (ClonTech Laboratories, Inc., Palo Alto, CA) with the exception that
additional libraries were generated with enzymes Nru I, Sca I, and Hinc II. First round
PCR was conducted in a Perkin Elmer 2400 Thermocycler with 7 cycles of 2 seconds at
94°C and 3 minutes at 72°C, and 36 cycles of 2 seconds at 94°C and 3 minutes at 65°C
with a final extension at 65°C for 4 minutes. Second round PCR used 5 cycles of 2
seconds at 94°C and 3 minutes at 72°C, and 20 cycles of 2 seconds at 94°C and 3 minutes
at 65°C with a final extension at 65°C for 4 minutes. The first and second round product
(20 µL) was separated by electrophoresis on a 1% TAE agarose gel. Amplification
products were obtained with the Stu I library for the reverse direction. The second round
product of 1.5 kb from this library was gel purified, cloned, and sequenced. Sequence
analysis revealed that the sequence derived from genome walking overlapped with the
CoAF1-CoAR3 fragment and shared sequence similarity with other sequences such as
acetoacetyl CoA transferase sequences (Figures 8-9).

Nucleic acid encoding the CoA transferase (propionyl-CoA transferase or pct)
from Megasphaera elsdenii was PCR amplified from chromosomal DNA using the following
PCR program: 25 cycles of 95°C for 30 seconds to denature, 50°C for 30 seconds to
anneal, and 72°C for 3 minutes for extension (plus 2 seconds per cycle). The primers
used were designated PCT-1.114 (5'-ATGAGAAAAGTAGAAATCATTAC-3'; SEQ ID
NO:58) and PCT-2.2045 (5'-GGCGGAAGTTGACGATAATG-3'; SEQ ID NO:59). The
resulting PCR product (about 2 kb as judged by agarose gel electrophoresis) was purified
using a Qiagen PCR purification kit (Qiagen Inc., Valencia, CA). The purified product
was ligated to pETBlue-1 using the Perfectly Blunt cloning Kit (Novagen, Madison, WI).
The ligation reaction was transformed into NovaBlue chemically competent cells
(Novagen, Madison, WI) that were spread on LB agar plates supplemented with 50
µg/mL carbenicillin, 40 µg/mL IPTG, and 40 µg/mL X-Gal. White colonies were isolated
and screened for the presence of inserts by restriction mapping. Isolates with the correct
restriction pattern were sequenced from each end using the primers pETBlueUP and
pETBlueDOWN (Novagen) to confirm the sequence at the ligation points.

The plasmid was transformed into Tuner (DE3) pLacI chemically competent cells
(Novagen, Madison, WI), and expression from the construct tested. Briefly, a culture was
grown overnight to saturation and diluted 1:20 the following morning in fresh LB
medium with the appropriate antibiotics. The culture was grown at 37°C with aeration to
an OD600 of about 0.6. The culture was induced with IPTG at a final concentration of 100
µM. The culture was incubated for an additional two hours at 37°C with aeration.
Aliquots were taken pre-induction and 2 hours post-induction for SDS-PAGE analysis. A
band of the expected molecular weight (55,653 Daltons predicted from the sequence) was
observed after IPTG treatment. This band was not observed in cells containing a plasmid
lacking the nucleic acid encoding the transferase.

Cell free extracts were prepared to assess enzymatic activity. Briefly, the cells
were harvested by centrifugation and disrupted by sonication. The sonicated cell
suspension was centrifuged to remove cell debris, and the supernatant was used in the
assays.

Transferase activity was measured in the following assay. The assay mixture used
contained 100 mM potassium phosphate buffer (pH 7.0), 200 mM sodium acetate, 1 mM
dithiobisnitrobenzoate (DTNB), 500 µM oxaloacetate, 25 µM CoA-ester substrate, and 3
µg/mL citrate synthase. If present, the CoA transferase transfers the CoA from the CoA
ester to acetate to form acetyl-CoA. The added citrate synthase condenses oxaloacetate
and acetyl-CoA to form citrate and free CoASH. The free CoASH complexes with
DTNB, and the formation of this complex can be measured by a change in the optical
density at 412 nm. The activity of the CoA transferase was measured using the following
substrates: lactyl-CoA, propionyl-CoA, acrylyl-CoA, and 3-hydroxypropionyl-CoA. The
units/mg of protein was calculated using the following formula:

(ΔE/min × Vf × dilution factor) / (Vs × 14.2) = units/mL

where ΔE/min is the change in absorbance per minute at 412 nm, Vf is the final volume of
the reaction, and Vs is the volume of sample added. The total protein concentration of the
cell free extract was about 1 mg/mL, so the units/mL equals units/mg.
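
The formula above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function and parameter names are hypothetical, the text does not define the constant 14.2 (it is simply passed through as a divisor), and the input readings in the usage line are made-up numbers, not measured data.

```python
def units_per_ml(delta_e_per_min, final_volume, sample_volume,
                 dilution_factor=1.0, divisor=14.2):
    """Apply (dE/min * Vf * dilution factor) / (Vs * 14.2) from the text.

    final_volume and sample_volume must use the same volume unit.
    The divisor 14.2 is taken from the text as-is; its meaning is not
    stated there, so it is kept as a plain parameter (assumption).
    """
    if sample_volume <= 0:
        raise ValueError("sample_volume must be positive")
    return (delta_e_per_min * final_volume * dilution_factor) / (sample_volume * divisor)


# Hypothetical reading: 0.142 absorbance units/min, 1.0 mL reaction, 0.01 mL sample.
example_units = units_per_ml(delta_e_per_min=0.142, final_volume=1.0, sample_volume=0.01)
print(f"{example_units:.3f} units/mL")
```

Because the extract was at about 1 mg/mL total protein, the returned units/mL value can be read directly as units/mg, as the text notes.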

Cell free extracts from cells containing nucleic acid encoding the CoA transferase
exhibited CoA transferase activity. Activity was
detected for the lactyl-CoA, propionyl-CoA, acrylyl-CoA, and 3-hydroxypropionyl-CoA
substrates (Table 2). The highest activity was detected for lactyl-CoA
and propionyl-CoA.

Table 2

Substrate                   Units/mg
Lactyl-CoA                  211
Propionyl-CoA               144
Acrylyl-CoA                 118
3-Hydroxypropionyl-CoA      110

The following assay was performed to test whether the CoA transferase activity
can use the same CoA substrate donors as recipients. Specifically, CoA transferase
activity was assessed using a Matrix-assisted Laser Desorption/Ionization Time of Flight
Mass Spectrometry (MALDI-TOF MS) Voyager RP workstation (PerSeptive
Biosystems). The following five reactions were analyzed:

1) acetate + lactyl-CoA -> lactate + acetyl-CoA
2) acetate + propionyl-CoA -> propionate + acetyl-CoA
3) lactate + acetyl-CoA -> acetate + lactyl-CoA
4) lactate + acrylyl-CoA -> acrylate + lactyl-CoA
5) 3-hydroxypropionate + lactyl-CoA -> lactate + 3-hydroxypropionyl-CoA

MALDI-TOF MS was used to measure simultaneously the appearance of the
product CoA ester and the disappearance of the donor CoA ester. The assay buffer
contained 50 mM potassium phosphate (pH 7.0), 1 mM CoA ester, and 100 mM
respective acid salt. Protein from a cell free extract prepared as described above was
added to a final concentration of 0.005 mg/mL. A control reaction was prepared from a
cell free extract prepared from cells lacking the construct containing the CoA
transferase-encoding nucleic acid. For each reaction, the cell free extract was added last
to start the reaction. Reactions were allowed to proceed at room temperature and were
stopped by adding 1 volume 10% trifluoroacetic acid (TFA). The reaction mixtures were
purified prior to MALDI-TOF MS analysis using Sep Pak Vac C18 50 mg columns (Waters, Inc.).
The columns were conditioned with 1 mL methanol and equilibrated with two washes of
1 mL 0.1% TFA. Each sample was applied to the column, and the flow through was
discarded. The column was washed twice with 1 mL 0.1% TFA. The sample was eluted
in 200 µL 40% acetonitrile, 0.1% TFA. The acetonitrile was removed by centrifugation
in vacuo. Samples were prepared for MALDI-TOF MS analysis by mixing 1:1 with 110
mM sinapinic acid in 0.1% TFA, 67% acetonitrile. The samples were allowed to air dry.
In reaction #1, the control sample exhibited a main peak at a molecular weight
corresponding to lactyl-CoA (MW 841). There was a minor peak at the molecular weight
corresponding to acetyl-CoA (MW 811). This minor peak was determined to be the
left-over acetyl-CoA from the synthesis of lactyl-CoA. The reaction #1 sample containing the
cell extract from cells transfected with the CoA transferase-encoding plasmid exhibited
complete conversion of lactyl-CoA to acetyl-CoA. No peak was observed for lactyl-CoA.
This result indicates that the CoA transferase activity can transfer CoA from lactyl-CoA
to acetate to form acetyl-CoA.

In reaction #2, the control sample exhibited a dominant peak at a molecular
weight corresponding to propionyl-CoA (MW 825). The reaction #2 sample containing
the cell extract from cells transfected with the CoA transferase-encoding plasmid
exhibited a dominant peak at a molecular weight corresponding to acetyl-CoA (MW 811).
No peak was observed for propionyl-CoA. This result indicates that the CoA transferase
activity can transfer CoA from propionyl-CoA to acetate to form acetyl-CoA.

In reaction #3, the control sample exhibited a dominant peak at a molecular
weight corresponding to acetyl-CoA (MW 811). The reaction #3 sample containing the
cell extract from cells transfected with the CoA transferase-encoding plasmid exhibited a
peak corresponding to lactyl-CoA (MW 841). The peak corresponding to acetyl-CoA did
not disappear. In fact, the ratio of the size of the two peaks was about 1:1. The observed
appearance of the peak corresponding to lactyl-CoA demonstrates that the CoA
transferase activity catalyzes reaction #3.

In reaction #4, the control sample exhibited a dominant peak at a molecular
weight corresponding to acrylyl-CoA (MW 823). The reaction #4 sample containing the
cell extract from cells transfected with the CoA transferase-encoding plasmid exhibited a
dominant peak corresponding to lactyl-CoA (MW 841). This result demonstrates that the
CoA transferase activity catalyzes reaction #4.

In reaction #5, deuterated lactyl-CoA was used to detect the transfer of CoA from
lactate to 3-hydroxypropionate since lactic acid and 3-HP have the same molecular
weight as do their respective CoA esters. Using deuterated lactyl-CoA allowed for the
differentiation between lactyl-CoA and 3-hydroxypropionate using MALDI-TOF MS.
The control sample exhibited a diffuse group of peaks at molecular weights ranging from
MW 841 to 845 due to the varying amounts of hydrogen atoms that were replaced with
deuterium atoms. In addition, a significant peak was observed at a molecular weight
corresponding to acetyl-CoA (MW 811). This peak was determined to be the left-over
acetyl-CoA from the synthesis of lactyl-CoA. The reaction #5 sample containing the cell
extract from cells transfected with the CoA transferase-encoding plasmid exhibited a
dominant peak at a molecular weight corresponding to 3-hydroxypropionyl-CoA (MW
841) as opposed to a group of peaks ranging from MW 841 to 845. This result
demonstrates that the CoA transferase catalyzes reaction #5.
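
The peak assignments in the five reactions above rely on a small lookup from observed molecular weight to CoA ester. A minimal Python sketch of that lookup follows; the molecular weights are the ones quoted in the text, while the function name and the tolerance value are hypothetical choices made for illustration.

```python
# Molecular weights as quoted in the text for the MALDI-TOF MS peaks.
COA_ESTER_MW = {
    "acetyl-CoA": 811,
    "acrylyl-CoA": 823,
    "propionyl-CoA": 825,
    "lactyl-CoA": 841,
    "3-hydroxypropionyl-CoA": 841,
}


def assign_peak(observed_mw, tolerance=0.5):
    """Return every CoA ester whose quoted MW lies within `tolerance` of the peak.

    The tolerance is an illustrative assumption, not an instrument specification.
    """
    return sorted(
        name for name, mw in COA_ESTER_MW.items()
        if abs(mw - observed_mw) <= tolerance
    )


print(assign_peak(811))  # one candidate
print(assign_peak(841))  # two candidates: mass alone cannot separate them
```

A peak at 841 maps to both lactyl-CoA and 3-hydroxypropionyl-CoA, which is why reaction #5 needed deuterated lactyl-CoA to tell donor from product.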






Example 2 - Cloning nucleic acid molecules that encode a
multiple polypeptide complex having lactyl-CoA dehydratase activity

The following methods were used to clone an E1 activator polypeptide. Briefly,
four degenerate forward and five degenerate reverse PCR primers were designed based on
conserved sequences of E1 activator protein homologs
(E1F1 5'-GCWACBGGYTAYGGYCG-3', SEQ ID NO:60;
E1F2 5'-GTYRTYGAYRTYGGYGGYCAGGA-3', SEQ ID NO:61;
E1F3 5'-ATGAACGAYAARTGYGCWGCWGG-3', SEQ ID NO:62;
E1F4 5'-TGYGCWGCWGGYACBGGYCGYTT-3', SEQ ID NO:63;
E1R1 5'-TCCTGRCCRCCRAYRTCRAYRAC-3', SEQ ID NO:64;
E1R2 5'-CCWGCWGCRCAYTTRTCGTTCAT-3', SEQ ID NO:65;
E1R3 5'-AARCGRCCVGTRCCWGCWGCRCA-3', SEQ ID NO:66;
E1R4 5'-GCTTCGSWTTCRACRATGSW-3', SEQ ID NO:67; and
E1R5 5'-GSWRATRACTTCGCWTTCWGCRAA-3', SEQ ID NO:68).
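
The degenerate primers above use IUPAC ambiguity codes (for example R = A/G, Y = C/T, W = A/T, S = C/G, B = C/G/T, V = A/C/G). As a rough sketch, the Python below computes the reverse complement and the fold degeneracy of such a primer. The function names are hypothetical; the sequences are copied from the primer list. It also illustrates that E1R1, E1R2, and E1R3 are the reverse complements of E1F2, E1F3, and E1F4, i.e. each pair targets the same conserved site on opposite strands.

```python
# IUPAC nucleotide codes: complement of each code and number of bases it covers.
IUPAC_COMPLEMENT = {
    "A": "T", "T": "A", "C": "G", "G": "C",
    "R": "Y", "Y": "R", "S": "S", "W": "W",
    "K": "M", "M": "K", "B": "V", "V": "B",
    "D": "H", "H": "D", "N": "N",
}
IUPAC_SIZE = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3, "N": 4,
}


def reverse_complement(seq):
    """Reverse-complement a primer written with IUPAC ambiguity codes."""
    return "".join(IUPAC_COMPLEMENT[base] for base in reversed(seq.upper()))


def degeneracy(seq):
    """Number of distinct unambiguous sequences a degenerate primer represents."""
    total = 1
    for base in seq.upper():
        total *= IUPAC_SIZE[base]
    return total


E1F1 = "GCWACBGGYTAYGGYCG"
E1F2 = "GTYRTYGAYRTYGGYGGYCAGGA"
E1R1 = "TCCTGRCCRCCRAYRTCRAYRAC"

print(reverse_complement(E1F2) == E1R1)  # E1R1 mirrors E1F2
print(degeneracy(E1F1), degeneracy(E1F2))
```

Fold degeneracy is the kind of quantity that motivates raising primer amounts for highly degenerate primers, since each individual sequence in the pool is present at only a fraction of the total primer concentration.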

The primers were used in all logical combinations in PCR using Taq polymerase
(Roche Molecular Biochemicals, Indianapolis, IN) and 1 ng of genomic DNA per µL
reaction mix. PCR was conducted using a touchdown PCR program with 4 cycles at an
annealing temperature of 60°C, 4 cycles at 58°C, 4 cycles at 56°C, and 18 cycles at 54°C.
Each cycle used an initial 30-second denaturing step at 94°C and a 3 minute extension
step at 72°C. The program had an initial denaturing step for 2 minutes at 94°C and a final
extension step of 4 minutes at 72°C. Time allowed for annealing was 45 seconds. The
amounts of PCR primer used in the reactions were increased 2-10 fold above typical PCR
amounts depending on the amount of degeneracy in the 3' end of the primer. In addition,
separate PCR reactions containing each individual primer were made to identify PCR
product resulting from single degenerate primers. Each PCR product (25 µL) was
separated by electrophoresis using a 1% TAE (Tris-acetate-EDTA) agarose gel.
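
The touchdown program described above can be written out as a per-cycle schedule. The Python sketch below expands the stages and sums the programmed hold times; the variable and function names are hypothetical, and ramp times between temperatures are ignored, so the total is a lower bound rather than an actual run time.

```python
# Touchdown stages from the text: (annealing temperature in deg C, number of cycles).
TOUCHDOWN_STAGES = [(60, 4), (58, 4), (56, 4), (54, 18)]

# Step durations from the text, in seconds.
INITIAL_DENATURE_S = 120   # 2 minutes at 94 C
DENATURE_S = 30            # per cycle, 94 C
ANNEAL_S = 45              # per cycle, stage-specific temperature
EXTEND_S = 180             # per cycle, 72 C
FINAL_EXTEND_S = 240       # 4 minutes


def expand_cycles(stages):
    """List the annealing temperature used in each successive cycle."""
    return [temp for temp, count in stages for _ in range(count)]


def programmed_seconds(stages):
    """Sum of programmed hold times only (thermocycler ramping is not modeled)."""
    cycles = len(expand_cycles(stages))
    per_cycle = DENATURE_S + ANNEAL_S + EXTEND_S
    return INITIAL_DENATURE_S + cycles * per_cycle + FINAL_EXTEND_S


schedule = expand_cycles(TOUCHDOWN_STAGES)
print(len(schedule), "cycles; annealing steps:", schedule)
print(programmed_seconds(TOUCHDOWN_STAGES) / 60, "minutes of programmed holds")
```

Starting at the highest annealing temperature favors specific priming in the early cycles, and the long final stage at 54°C then amplifies whatever specific product was seeded.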
The E1F2-E1R4, E1F2-E1R5, E1F3-E1R4, E1F3-E1R5, and E1F4-E1R4R2
combinations produced a band of 195, 207, 144, 156, and 144 bp, respectively. These
bands matched the expected size based on E1 activator sequences from other species. No
band was visible with individual primer control reactions. The E1F2-E1R5 fragment
(207 bp) was isolated and purified using Qiagen Gel Extraction procedure (Qiagen Inc.,
Valencia, CA). The purified band (4 µL) was ligated into a pCRII vector that then was
transformed into TOP10 E. coli cells by heat-shock using a TOPO cloning procedure
(Invitrogen, Carlsbad, CA). Transformations were plated on LB media containing 100
µg/mL of ampicillin (Amp) and 50 µg/mL of 5-Bromo-4-Chloro-3-Indolyl-β-D-
Galactopyranoside (X-gal). Single, white colonies were plated onto fresh media and
screened in a PCR reaction using the E1F2 and E1R5 primers to confirm the presence of
the insert. Plasmid DNA was obtained from multiple colonies using a QiaPrep Spin
Miniprep Kit (Qiagen, Inc). Once obtained, the plasmid DNA was quantified and used
for DNA sequencing with M13R and M13F primers. Sequence analysis revealed a
nucleic acid sequence encoding a polypeptide and revealed that the E1F2-E1R5 fragment
shared sequence similarity with E1 activator sequences (Figures 12-13).
Genome walking was performed to obtain the complete coding sequence of E2 α
and β subunits. Briefly, four primers for performing genome walking in both upstream
and downstream directions were designed using the portion of the 207 bp E1F2-E1R5
fragment sequence that was internal to the E1F2 and E1R5 degenerate primers
(E1GSP1F 5'-ACGTCATGTCGAAGGTACTGGAAATCC-3', SEQ ID NO:69;
E1GSP2F 5'-GGGACTGGTACTTCAAATCGAAGCATC-3', SEQ ID NO:70;
E1GSP1R 5'-TGACGGCAGGGGGATGCTTCGATTTGA-3', SEQ ID NO:71; and
E1GSP2R 5'-TCAGACATGGGGATTTCCAGTACCTTC-3', SEQ ID NO:72). The E1GSP1F and
E1GSP2F primers face downstream, while the E1GSP1R and E1GSP2R primers face
upstream. In addition, the E1GSP2F and E1GSP2R primers are nested inside the
E1GSP1F and E1GSP1R primers.

Genome walking was performed using the Universal Genome Walking Kit
(ClonTech Laboratories, Inc., Palo Alto, CA) with the exception that additional libraries
were generated with enzymes Nru I, Sca I, and Hinc II. First round PCR was performed
in a Perkin Elmer 2400 Thermocycler with 7 cycles of 2 seconds at 94°C and 3 minutes at
72°C, and 36 cycles of 2 seconds at 94°C and 3 minutes at 65°C with a final extension at
65°C for 4 minutes. Second round PCR used 5 cycles of 2 seconds at 94°C and 3 minutes
at 72°C, and 20 cycles of 2 seconds at 94°C and 3 minutes at 65°C with a final extension
at 65°C for 4 minutes. The first and second round product (20 µL) was separated by
electrophoresis using 1% TAE agarose gel. Amplification products were obtained with
the Stu I library for both forward and reverse directions. The second round product of
about 1.5 kb for forward direction and 3 kb fragment for reverse direction from the Stu I
library were gel purified, cloned, and sequenced. Sequence analysis revealed that the
sequence derived from genome walking overlapped with the E1F2-E1R5 fragment.

To obtain additional sequence, a second genome walk was performed using a first
round primer (E1GSPF5 5'-CCGTGTTACTTGGGAAGGTATGGCTGTCTG-3', SEQ
ID NO:73) and a second round primer
(E1GSPF6 5'-GCCAATGAAGGAGGAAACCACTAATGAGTC-3', SEQ ID NO:74).
The genome walk was performed using the
Nru I, Sca I, and Hinc II libraries. In addition, ClonTech's Advantage Genomic
Polymerase was used for the PCR. First round PCR was performed in a Perkin Elmer
2400 Thermocycler with an initial denaturing step at 94°C for 2 minutes, 7 cycles of 2
seconds at 94°C and 3 minutes at 72°C, and 36 cycles of 2 seconds at 94°C and 3 minutes
at 65°C with a final extension at 65°C for 4 minutes. Second round PCR used 5 cycles of
2 seconds at 94°C and 3 minutes at 72°C, and 20 cycles of 2 seconds at 94°C and 3
minutes at 65°C with a final extension at 65°C for 4 minutes. The first and second round
product (20 µL) was separated by electrophoresis on a 1% agarose gel. An about 1.5 kb
amplification product was obtained, and this
band was gel purified, cloned, and sequenced. Sequence analysis revealed that it
overlapped with the previously obtained genome walk fragment. In addition, sequence
analysis revealed a nucleic acid sequence encoding an E2 α subunit that shares sequence
similarities with other sequences (Figures 16-17). Further, sequence analysis revealed a
nucleic acid sequence encoding an E2 β subunit that shares sequence similarities with
other sequences (Figures 20-21).

Additional PCR and sequence analysis revealed the order of polypeptide encoding
sequences within the region containing the lactyl-CoA dehydratase-encoding sequences.
Specifically, the E1GSP1F and COAGSP1R primer pair and the COAGSP1F and
E1GSP1R primer pair were used to amplify fragments that encode both the CoA
transferase and E1 activator polypeptides. Briefly, M. elsdenii genome DNA (1 ng) was
used as a template. The PCR was conducted in a Perkin Elmer 2400 Thermocycler using
Long Template Polymerase (Roche Molecular Biochemicals, Indianapolis, IN). The
PCR program used was as follows: 94°C for 2 minutes; 29 cycles of 94°C for 30 seconds,
61°C for 45 seconds, and 72°C for 6 minutes; and a final extension of 72°C for 10
minutes. Both PCR products (20 µL) were separated on a 1% agarose gel. An
amplification product (about 1.5 kb) was obtained using the COAGSP1F and E1GSP1R
primer pair. This product was gel purified, cloned, and sequenced (Figure 22).

The organization of the M. elsdenii operon containing the lactyl-CoA dehydratase-
encoding sequences was determined to contain the following polypeptide-encoding
sequences in the following order: CoA transferase (Figure 6), ORFX (Figure 23), E1
activator protein of lactyl-CoA dehydratase (Figure 10), E2 α subunit of lactyl-CoA
dehydratase (Figure 14), E2 β subunit of lactyl-CoA dehydratase (Figure 18), and
truncated CoA dehydrogenase (Figure 25).

The lactyl-CoA dehydratase (lactyl-CoA dehydratase or lcd) from M. elsdenii was
PCR amplified from chromosomal DNA using the following program: 94°C for 2
minutes; 7 cycles of 94°C for 30 seconds, 47°C for 45 seconds, and 72°C for 3 minutes;
25 cycles of 94°C for 30 seconds, 54°C for 45 seconds, and 72°C for 3 minutes; and 72°C
for 7 minutes. One primer pair was used
(OSNBE1F 5'-GGGAATTCCATATGAAAACTGTGTATACTCTC-3', SEQ ID NO:75 and
OSNBE1R 5'-CGACGGATCCTTAGAGGATTTCCGAGAAAGC-3', SEQ ID NO:76). The amplified product
(about 3.2 kb) was separated on 1% agarose gel, cut from the gel, and purified with a
Qiagen Gel Extraction kit (Qiagen, Valencia, CA). The purified product was digested
with Nde I and BamH I restriction enzymes and ligated into pET11a vector (Novagen)
digested with the same enzymes. The ligation reaction was transformed into NovaBlue
chemically competent cells (Novagen) that then were spread on LB agar plates
supplemented with 50 µg/mL carbenicillin. Isolated individual colonies were screened
for the presence of inserts by restriction mapping. Isolates with the correct restriction
pattern were sequenced from each end using Novagen primers (T7 promoter primer
#69348-3 and T7 terminator primer #69337-3) to confirm the sequence at the ligation
points.

A plasmid having the correct insert was transformed into Tuner (DE3) pLacI
chemically competent cells (Novagen, Madison, WI). Expression from this construct was
tested as follows. A culture was grown overnight to saturation and diluted 1:20 the
following morning in fresh LB medium with the appropriate antibiotics. The culture was
grown at 37°C with aeration to an OD600 of about 0.6. The culture was induced with
IPTG at a final concentration of 100 µM. The culture was incubated for an additional two
hours at 37°C with aeration. Aliquots were taken pre-induction and 2 hours post-
induction for SDS-PAGE analysis. Bands of the expected molecular weight (27,024
Daltons for the E1 subunit, 48,088 Daltons for the E2 α subunit, and 42,517 Daltons for
the E2 β subunit, all predicted from the sequence) were observed. These bands were not
observed in cells containing a plasmid lacking the nucleic acid encoding the three
components of the lactyl-CoA dehydratase.

Cell free extracts were prepared by growing cells in a sealed serum bottle
overnight at 37°C. Following overnight growth, the cultures were induced with 1 mM
IPTG (added using anaerobic technique) and incubated an additional 2 hours at 37°C. The
cells were harvested by centrifugation and disrupted by sonication under strict anaerobic
conditions. The sonicated cell suspension was centrifuged to remove cell debris, and the
supernatant was used in the assays. The buffer used for cell resuspension/sonication was
50 mM Tris-HCl (pH 7.5), 200 µM ATP, 7 mM MgSO4, 4 mM DTT, 1 mM dithionite,
and 100 µM NADH.

Dehydratase activity was detected with MALDI-TOF MS. The assay was
conducted in the same buffer as above with 1 mM lactyl-CoA or 1 mM acrylyl-CoA
added and about 5 mg/mL cell free extract. Prior to MALDI-TOF MS analysis, samples
were purified using Sep Pak Vac C18 columns (Waters, Inc.) as described in Example 1.
The following two reactions were analyzed:

1) acrylyl-CoA -> lactyl-CoA
2) lactyl-CoA -> acrylyl-CoA

In reaction #1, the control sample exhibited a peak at a molecular weight
corresponding to acrylyl-CoA (MW 823). The reaction #1 sample containing the cell
extract from cells transfected with the dehydratase-encoding plasmid exhibited a major
peak at a molecular weight corresponding to lactyl-CoA (MW 841). This result indicates
that the dehydratase activity can convert acrylyl-CoA into lactyl-CoA.

To detect dehydratase activity on lactyl-CoA, reaction #2 was carried out in 80%
D2O. The control sample exhibited a peak at a molecular weight corresponding to lactyl-
CoA (MW 841). The reaction #2 sample containing the cell extract from cells transfected
with the dehydratase-encoding plasmid revealed a lactyl-CoA peak shifted to a deuterated
form. This result indicates that the dehydratase enzyme is active on lactyl-CoA. In
addition, the results from both reactions indicate that the dehydratase enzyme can
catalyze the lactyl-CoA <-> acrylyl-CoA reaction in both directions.

Example 3 - Cloning nucleic acid molecules that encode
a polypeptide having 3-hydroxypropionyl-CoA dehydratase activity

Genomic DNA was isolated from Chloroflexus aurantiacus cells (ATCC 29365).
Briefly, C. aurantiacus cells in 920 Chloroflexus medium were grown in 50 mL cultures
(Falcon 2070 polypropylene tubes) using an Innova 4230 Incubator Shaker (New
Brunswick Scientific; Edison, NJ) at 50°C with interior lights. Once grown, the cells
were pelleted, washed with 5 mL of a 10 mM Tris solution, and re-pelleted. Genomic
DNA was isolated from the pelleted cells using a Gentra Genomic "Puregene" DNA
isolation kit (Gentra Systems; Minneapolis, MN). Briefly, the pelleted cells were
resuspended in 1 mL Gentra Cell Suspension Solution to which 14.2 mg of lysozyme and
4 µL of 20 mg/mL proteinase K solution was added. The cell suspension was incubated
at 37°C for 30 minutes. The precipitated genomic DNA was recovered by centrifugation
at 3500 x g for 25 minutes and air-dried for 10 minutes. The genomic DNA was
suspended in 300 µL of a 10 mM Tris solution and stored at 4°C.

The genomic DNA was used as a template in PCR amplification reactions with
primers designed based on conserved domains of crotonase homologs and a Chloroflexus
aurantiacus codon usage table. Briefly, two degenerate forward (CRF1 and CRF2) and
three degenerate reverse (CRR1, CRR2, and CRR3) PCR primers were designed
(CRF1 5'-AAYCGBCCVAARGCNCTSAAYGC-3', SEQ ID NO:77;
CRF2 5'-TTYGTBGCNGGYGCNGAYAT-3', SEQ ID NO:78;
CRR1 5'-ATRTCNGCRCCNGCVACRAA-3', SEQ ID NO:79;
CRR2 5'-CCRCCRCCSAGNGCRWARCCRTT-3', SEQ ID NO:80; and
CRR3 5'-SSWNGCRATVCGRATRTCRAC-3', SEQ ID NO:81).

These primers were used in all logical combinations in PCR using Taq polymerase
(Roche Molecular Biochemicals; Indianapolis, IN) and 1 ng of the genomic DNA per µL
reaction mix. The PCR was conducted using a touchdown PCR program with 4 cycles at
an annealing temperature of 61°C, 4 cycles at 59°C, 4 cycles at 57°C, 4 cycles at 55°C,
and 16 cycles at 52°C. Each cycle used an initial 30-second denaturing step at 94°C and
a 3-minute extension step at 72°C. The program also had an initial denaturing step for 2
minutes at 94°C and a final extension step of 4 minutes at 72°C. The time allowed for
annealing was 45 seconds. The amounts of PCR primer used in the reaction were
increased 4-12 fold above typical PCR amounts depending on the amount of degeneracy
in the 3' end of the primer. In addition, separate PCR reactions containing each
individual primer were performed to identify amplification products resulting from single
degenerate primers. Each PCR product (25 µL) was separated by gel electrophoresis
using a 1% TAE (Tris-acetate-EDTA) agarose gel.

The CRF1-CRR1 and CRF2-CRR2 combinations produced a unique band of
about 120 and about 150 bp, respectively. These bands matched the expected size based
on crotonase genes from other species. No 120 bp or 150 bp band was observed from
individual primer control reactions. Both fragments (i.e., the 120 bp and 150 bp bands)
were isolated and purified using the Qiagen Gel Extraction kit (Qiagen Inc., Valencia,
CA). Each purified fragment was ligated into a vector that then was
transformed into TOP10 E. coli cells by a heat-shock method using a TOPO cloning
procedure (Invitrogen, Carlsbad, CA). Transformations were plated on LB media
containing 100 µg/mL of ampicillin (Amp) and 50 µg/mL of 5-Bromo-4-Chloro-3-
Indolyl-β-D-Galactopyranoside (X-gal). Single, white colonies were plated onto fresh
media and screened in a PCR reaction using the CRF1 and CRR1 primers and the CRF2
and CRR2 primers to confirm the presence of the desired insert. Plasmid DNA was
obtained from multiple colonies with the desired insert using a QiaPrep Spin Miniprep
Kit (Qiagen, Inc.). Once obtained, the DNA was quantified and used for DNA
sequencing with M13R and M13F primers. Sequence analysis revealed the presence of
two different clones from the PCR product of about 150 bp. Each shared sequence
similarity with crotonase and hydratase sequences. The two clones were designated
OS17 (157 bp PCR product) and OS19 (151 bp PCR product).

Genome walking was performed to obtain the complete coding sequence of OS17.
Briefly, primers for conducting genome walking in both upstream and downstream
directions were designed using the portion of the 157 bp CRF2-CRR2 fragment sequence
that was internal to the CRF2 and CRR2 degenerate primers
(OS17F1 5'-CGCTGATATTGGCCAGTTGCTCGAAG-3', SEQ ID NO:82;
OS17F2 5'-CCCATCTTGCTTTCCGCAAGATTGAGC-3', SEQ ID NO:83;
OS17F3 5'-CAATCGGCCTGCCGAATAACGCCCATCT-3', SEQ ID NO:84;
OS17R1 5'-CTTCGAGCAACTGGCGAATATCAGCG-3', SEQ ID NO:85;
OS17R2 5'-GCTCAATCTTGCGGAAAGCAAGATGGG-3', SEQ ID NO:86; and
OS17R3 5'-AGATGGGCGTTATTCGGCAGGGCCATTG-3', SEQ ID NO:87).
The OS17F1, OS17F3, and OS17F2 primers face
downstream, while the OS17R2, OS17R3, and OS17R1 primers face upstream.

Genome walking was conducted using the Universal Genome Walking kit
(ClonTech Laboratories, Inc., Palo Alto, CA) with the exception that additional libraries
were generated with enzymes Nru I, Fsp I, and Hinc II. The first round PCR was
conducted in a Perkin Elmer 2400 Thermocycler with 7 cycles of 2 seconds at 94°C and 3
minutes at 72°C, and 36 cycles of 2 seconds at 94°C and 3 minutes at 66°C with a final
extension at 66°C for 4 minutes. Second round PCR used 5 cycles of 2 seconds at 94°C
and 3 minutes at 72°C, and 20 cycles of 2 seconds at 94°C and 3 minutes at 66°C with a
final extension at 66°C for 4 minutes. The first and second round amplification product
(5 µL) was separated by gel electrophoresis on a 1% TAE agarose gel. After the second
round PCR, an amplification product of about 0.4 kb was obtained with the Fsp I library
using the OS17R1 primer in the reverse direction, and an amplification product of about
JL6Jcb was obtained with the Hinc II library using the OS17F2 primer in the forward
direction. These PCR products were cloned and sequenced.

Sequence analysis revealed that the sequences derived from genome walking
overlapped with the CRF2-CRR2 fragment and shared sequence similarity with crotonase
and hydratase sequences.

A second genome walk was performed to obtain additional sequences. Six
primers were designed for this second genome walk
(OS17F4 5'-AAGCTGGGTCTGATCGATGCCATTGCTACC-3', SEQ ID NO:88;
OS17F5 5'-CTCGATTATCGCCCATCCACGTATCGAG-3', SEQ ID NO:89;
OS17F6 5'-TGGATGCAATCCGCTATGGCATTATCCACG-3', SEQ ID NO:90;
OS17R4 5'-TCATTCAGTGCGTTCACCGGCGGATTTGTC-3', SEQ ID NO:91;
OS17R5 5'-TCGATCCGGAAGTAGCGATAGCGTTCGATG-3', SEQ ID NO:92; and
OS17R6 5'-CTTGGCTGCAATCTCTTCGAGCACTTCAGG-3', SEQ ID NO:93).
The OS17F4, OS17F5, and OS17F6
primers faced downstream, while the OS17R4, OS17R5, and OS17R6 primers faced
upstream.

The second genome walk was performed using the same methods described for
the first genome walk. After the second round of walking, an amplification product of
about 2.3 kb was obtained with a Hinc II library using the OS17R5 primer in the reverse
direction, and an amplification product of about 0.6 kb was obtained with a Pvu II library
using the OS17F5 primer in the forward direction. The PCR products were cloned and
sequenced. Sequence analysis revealed that the sequences derived from the second
genome walk overlapped with the sequence obtained during the first genome walk.
In addition, the sequence analysis revealed a sequence with 3572 bp.

A BLAST search revealed that the polypeptide encoded by this sequence shares
sequence similarity with polypeptides having three different activities. Specifically, the
beginning of the OS17 encoded-polypeptide shares sequence similarity with CoA-
synthesases, the middle region of the OS17 encoded-polypeptide shares sequence
similarity with enoyl-CoA hydratases, and the end region of the OS17 encoded-
polypeptide shares sequence similarity with CoA-reductases.

A third genome walk was performed using four primers
(OS17UP-6 5'-CATCAGAGGTAATCACCACTGGTGCA-3', SEQ ID NO:94;
OS17UP-7 5'-AAGTAGTAGGCCACCTCGTCGCCATA-3', SEQ ID NO:95;
OS17DN-1 5'-GCCAATCAGGCGCTGATCTATGTTCT-3', SEQ ID NO:96; and
OS17DN-2 5'-CTGATCTATGTTCTGGCCTCGGAGGT-3', SEQ ID NO:97). The OS17UP-6 and
OS17UP-7 primers face upstream, while the OS17DN-1 and OS17DN-2 primers face
downstream. The third genome walk yielded an amplification product of about 1 2 kb
with a Nru I library using the OS17UP-7 primer in the reverse direction. In addition,
amplification products of about 4 kb and about 1.1 kb were obtained with a Hinc II and
Fsp I library, respectively, using the OS17DN-2 primer in the forward direction.
Sequence analysis revealed a nucleic acid sequence encoding a polypeptide (Figures 27-
28). The complete OS17 gene had 5466 nucleotides and encoded a 1822 amino acid
polypeptide. The calculated molecular weight of the OS17 polypeptide from the
sequence was about 201,346 (pI^SJl).

A BLAST search analysis revealed that the product of the OS17 nucleic acid has
three different activities based on sequence similarity to (1) CoA-synthesases at the
beginning of the OS17 sequence, (2) 3-HP dehydratases in the middle of the OS17
sequence, and (3) CoA-reductases at the end of the OS17 sequence. Thus, the OS17
clone appeared to encode a single enzyme capable of catalyzing three distinct reactions
leading to the direct conversion of 3-hydroxypropionate to propionyl-CoA: 3-HP -> 3-HP-
CoA -> acrylyl-CoA -> propionyl-CoA.
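
The reported OS17 gene and polypeptide sizes can be cross-checked with simple arithmetic. The Python sketch below uses only the three numbers given in the text; the variable and function names are hypothetical. Note that the codon count equals the residue count, which suggests the 5466-nucleotide figure does not include a stop codon, although the text does not say so.

```python
# Values reported in the text for the complete OS17 gene and its product.
GENE_LENGTH_NT = 5466
POLYPEPTIDE_LENGTH_AA = 1822
CALCULATED_MW_DA = 201346


def codon_count(length_nt):
    """Number of whole codons in a coding sequence; rejects partial codons."""
    if length_nt % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    return length_nt // 3


codons = codon_count(GENE_LENGTH_NT)
mean_residue_mass = CALCULATED_MW_DA / POLYPEPTIDE_LENGTH_AA

print(codons == POLYPEPTIDE_LENGTH_AA)           # codons match residues
print(round(mean_residue_mass, 1), "Da per residue on average")
```

An average of roughly 110 Da per residue is typical for proteins, so the calculated molecular weight is consistent with the stated length.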

The OS17 gene from C. aurantiacus was PCR amplified from chromosomal DNA
using the following conditions: 94°C for 3 minutes; 25 cycles of 94°C for 30 seconds to
denature, 54°C for 30 seconds to anneal, and 68°C for 6 minutes for extension; followed
by 68°C for 10 minutes for final extension. Two primers were used
(OS17F 5'-GGGAATTCCATATGATCGACACTGCG-3', SEQ ID NO:136; and
OS17R 5'-CGAAGGATCCAACGATAATCGGCTCAGCAC-3', SEQ ID NO:137). The resulting
PCR product (~5.6 kb) was purified using a Qiagen PCR purification kit (Qiagen Inc.,
Valencia, CA). The purified product was digested with Nde I and BamH I restriction
enzymes, heated at 80°C for 20 minutes to inactivate the enzymes, purified using a Qiagen
PCR purification kit, and ligated into a pET11a vector (Novagen, Madison, WI)
previously digested with Nde I and BamH I enzymes. The ligation reaction was
transformed into NovaBlue chemically competent cells (Novagen, Madison, WI) that
were spread on LB agar plates supplemented with 50 µg/mL carbenicillin. Individual
transformants were screened by PCR amplification of the OS17 DNA with the OS17F
and OS17R primers and conditions as described above directly from colony cells.
Clones that yielded the 5.6 kb product were used for plasmid purification with a Qiagen
QiaPrep Spin Miniprep Kit (Qiagen, Inc). Resulting plasmids were transformed into E.
coli BL21(DE3) cells, and OS17 polypeptide expression was induced. The apparent
molecular weight of the OS17 polypeptide according to SDS gel electrophoresis was
about 190,000 Da.

To assay OS17 polypeptide function, a 100 mL culture of BL21-DE3/pET11a-
OS17 cells was started using 1 mL of overnight grown culture as an inoculum. The
culture was grown to an OD of 0.5-0.6 and was induced with 100 µM IPTG. After two
and a half hours of induction, the cells were harvested by spinning at 8000 rpm in the
floor centrifuge. The cells were washed with 10 mM Tris-HCl (pH 7.8) and passed twice
through a French Press at a gauge pressure of 1000 psi. The cell debris was removed by
centrifugation at 15,000 rpm. The activity of the OS17 polypeptide was measured
spectrophotometrically, and the products formed during this enzymatic transformation
were detected by LC/MS. The assay mix was as follows (J. Bacteriol., 181:1088-1098
(1999)):

Reagent                       Volume        Final Conc.
Tris-HCl (1000 mM, pH 7.8)    10 µL         50 mM
MgCl2 (100 mM)                10 µL         5 mM
ATP (30 mM)                   20 µL         3 mM
KCl (100 mM)                  20 µL         10 mM
CoASH (5 mM)                  20 µL         0.5 mM
NAD(P)H                       20 µL         0.5 mM
3-hydroxypropionate           2 µL          1 mM
Protein extract (7 mg/mL)     20 (40) µL    140 µg
DI water                      78 (58) µL
Total                         200 µL
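
The final concentrations listed in the assay mix follow from the usual dilution relation, final concentration = stock concentration x volume added / total volume. The following is a minimal Python sketch of that arithmetic, assuming the 200 µL total volume and the stock concentrations given in the table; the function and variable names are illustrative and are not part of the original protocol.

```python
# Dilution check for the assay mix: C_final = C_stock * V_added / V_total.
# Stock concentrations (mM) and volumes (uL) are taken from the assay table;
# the names used here are illustrative only.
TOTAL_UL = 200.0

ASSAY_MIX = {
    "Tris-HCl": (1000.0, 10.0),
    "MgCl2": (100.0, 10.0),
    "ATP": (30.0, 20.0),
    "KCl": (100.0, 20.0),
    "CoASH": (5.0, 20.0),
}


def final_conc_mm(stock_mm, volume_ul, total_ul=TOTAL_UL):
    """Return the final concentration (mM) after dilution into total_ul."""
    return stock_mm * volume_ul / total_ul


def protein_ug(stock_mg_per_ml, volume_ul):
    """Return the protein mass (ug) delivered; 1 mg/mL equals 1 ug/uL."""
    return stock_mg_per_ml * volume_ul


for name, (stock, vol) in ASSAY_MIX.items():
    print(f"{name}: {final_conc_mm(stock, vol):g} mM")
print(f"Protein: {protein_ug(7.0, 20.0):g} ug")
```

Each computed value can be compared with the Final Conc. column; the 40 µL protein variant would deliver twice the protein mass, with the water volume reduced to keep the total at 200 µL.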





The initial rate of reaction was measured by monitoring the disappearance of
NAD(P)H at 340 nm. The activity of the OS17 polypeptide was measured using 3-HP as
the substrate. The units/mL of total protein was calculated using the formula set forth in
Example 1. The activity of the expressed OS17 polypeptide was calculated to be 0.061
U/mL of total protein. The reaction products were purified using a Sep Pak Vac column
(Waters). The column was conditioned with 1 mL methanol and washed two times with
0.5 mL 0.1% TFA. The sample was then applied to the column, and the column was
washed two more times with 0.5 mL 0.1% TFA. The sample was eluted with 200 µL of
40% acetonitrile, 0.1% TFA. The acetonitrile was removed from the sample by vacuum
centrifugation. The reaction products were analyzed by LC/MS.
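
The formula of Example 1 is not reproduced in this section. As a general illustration only, a rate of NAD(P)H disappearance at 340 nm can be converted to enzyme units with the Beer-Lambert law. The sketch below assumes the commonly cited extinction coefficient of 6220 M^-1 cm^-1 for NAD(P)H at 340 nm and a 1 cm path length; the function name and the example slope are hypothetical, and the calculation may differ from the formula actually used in Example 1.

```python
# Convert an absorbance slope at 340 nm into enzyme activity (Beer-Lambert law).
# Assumptions (not from the source): epsilon = 6220 M^-1 cm^-1 for NAD(P)H,
# 1 cm path length, and 1 U = 1 umol of NAD(P)H consumed per minute.
EPSILON_NADH_340 = 6220.0  # M^-1 cm^-1


def units_per_ml(delta_a340_per_min, assay_volume_ml, extract_volume_ml,
                 path_cm=1.0, epsilon=EPSILON_NADH_340):
    """Return activity in U per mL of extract added to the assay."""
    # Concentration change in the cuvette: mol/L/min
    rate_m_per_min = delta_a340_per_min / (epsilon * path_cm)
    # mol/L/min -> umol/mL/min
    rate_umol_per_ml_min = rate_m_per_min * 1e6 / 1e3
    # Total umol/min in the assay, divided by the extract volume used
    return rate_umol_per_ml_min * assay_volume_ml / extract_volume_ml


# Hypothetical slope of 0.0311 A340/min in a 0.2 mL assay with 0.02 mL extract
print(round(units_per_ml(0.0311, 0.2, 0.02), 3))
```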

Analyses of thioesters, namely propionyl CoA, acrylyl CoA, and 3-HP CoA, from
the above reaction were carried out using a Waters/Micromass ZQ LC/MS instrument
which had a Waters 2590 liquid chromatograph with a Waters 996 Photo-Diode Array
(PDA) placed in series between the chromatograph and the single quadrupole mass
spectrometer. LC separations were made using a 4.6 x 150 mm YMC ODS-AQ (3 µm
particles, 120 Å pores) reversed-phase chromatography column at room temperature.
CoA esters were eluted in Buffer A (25 mM ammonium acetate, 0.5% acetic acid) with a
linear gradient of Buffer B (acetonitrile, 0.5% acetic acid). A flow rate of 0.25 mL/minute
was used, and photodiode array UV absorbance was monitored from 200 to 400 nm. All
parameters of the electrospray MS system were optimized and selected based on
generation of protonated molecular ions ([M+H]+) of the analytes of interest and
production of characteristic fragment ions. The following instrumental parameters were
used for ESI-MS detection of CoA and organic acid-CoA thioesters in the positive ion
mode: Extractor: 1 V; RF lens: 0 V; Source temperature: 100°C; Desolvation
temperature: 300°C; Desolvation gas: 500 L/hour; Cone gas: 40 L/hour; Low mass
resolution: 13.0; High mass resolution: 14.5; Ion energy: 0.5; Multiplier: 650.
Uncertainties for mass charge ratios (m/z) and molecular masses are ± 0.01%.

The enzyme assay mix from strains expressing the OS17 polypeptide exhibited
peaks for propionyl CoA, acrylyl CoA, and 3-HP CoA, with the propionyl CoA peak
being the dominant peak. These peaks were missing in the enzyme assay mix obtained
from the control strain, which carried vector pET11a without an insert. These results
indicate that the OS17 polypeptide has CoA synthetase activity, CoA hydratase activity,
and dehydrogenase activity.

Genome walking also was performed to obtain the complete coding sequence of
OS19. Briefly, primers for conducting genome walking in both upstream and
downstream directions were designed using the portion of the 151 bp CRF2-CRR2
fragment sequence that was internal to the CRF2 and CRR2 degenerate primers (OS19F1
5'-GGCTGATATCAAAGCGATGGCCAATGC-3', SEQ ID NO:98; OS19F2
5'-CCACGCCTATTGATATGCTCACCAGTG-3', SEQ ID NO:99; OS19F3
5'-GCAAACCGGTGATTGCTGCCGTGAATGG-3', SEQ ID NO:100; OS19R1
5'-GCATTGGCCATCGCTTTGATATCAGCC-3', SEQ ID NO:101; OS19R2
5'-CACTGGTGAGCATATCAATAGGCGTGG-3', SEQ ID NO:102; and OS19R3
5'-CCATTCACGGCAGCAATCACCGGTTTGC-3', SEQ ID NO:103). The OS19F1, OS19F2, and
OS19F3 primers face downstream, while the OS19R1, OS19R2, and OS19R3 primers face
upstream.
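
Each upstream-facing primer listed above is the reverse complement of its downstream-facing partner, which is why the two sets walk in opposite directions from the same internal sequence. The following Python sketch checks this for two of the three pairs using the sequences as printed; the helper name is illustrative.

```python
# Check that an upstream-facing primer is the reverse complement of its
# downstream-facing partner. Sequences are copied from the primer list.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


PRIMER_PAIRS = {
    "OS19F1/OS19R1": ("GGCTGATATCAAAGCGATGGCCAATGC",
                      "GCATTGGCCATCGCTTTGATATCAGCC"),
    "OS19F3/OS19R3": ("GCAAACCGGTGATTGCTGCCGTGAATGG",
                      "CCATTCACGGCAGCAATCACCGGTTTGC"),
}

for name, (fwd, rev) in PRIMER_PAIRS.items():
    print(name, reverse_complement(fwd) == rev)
```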

An amplification product of about 0.25 kb was obtained with the Fsp I library
using the OS19R1 primer, while an amplification product of about 0.65 kb was obtained
with the Pvu II library using the OS19R1 primer. In addition, an amplification product of
about 0.4 kb was obtained with the Pvu II library using the OS19F3 primer. The PCR
products were cloned and sequenced. Sequence analysis revealed that the sequences
derived from genome walking overlapped with the CRF2-CRR2 fragment and shared
sequence similarity with crotonase and hydratase sequences. The obtained sequences
accounted for most of the coding sequence including the start codon.

A second genome walk was performed to obtain additional sequence using two
primers (OS19F7 5'-TCATCATGGCCAGTGAAAACGCGCAGTTCG-3', SEQ ID
NO:104 and OS19F8 5'-GGATCGCGCAAACCATTGCCACCAAATCAC-3', SEQ ID
NO:105). The OS19F7 and OS19F8 primers face downstream.
An amplification product (about 0.7 kb) obtained from the Pvu II library was
cloned and sequenced. Sequence analysis revealed that the sequence derived from the
second genome walk overlapped with the sequence obtained from the first genome walk
and contained the stop codon. The full-length OS19 clone was found to share sequence
similarity with other sequences such as crotonase and enoyl-CoA hydratase sequences
(Figures 32-33).

The OS19 clone was found to encode a polypeptide having 3-hydroxypropionyl-CoA
dehydratase activity, also referred to as acrylyl-CoA hydratase activity. The nucleic
acid encoding the OS19 dehydratase from C. aurantiacus was PCR amplified from
chromosomal DNA using the following conditions: 94°C for 3 minutes; 25 cycles of
94°C for 30 seconds to denature, 56°C for 30 seconds to anneal, and 68°C for 1 minute
for extension; and 68°C for 5 minutes for final extension. Two primers were used
(OSACH3 5'-ATGAGTGAAGAGTCTCTGGTTCTCAGC-3', SEQ ID NO:106 and
OSACH2 5'-AGATCGCAATCGCTCGTGTATGTC-3', SEQ ID NO:107).
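
The cycling conditions above can be written down as a small data structure, which makes it easy to compare programs and to estimate the minimum run time. The following Python sketch is illustrative only: the class and function names are assumptions, and ramp times between temperatures are ignored, so an actual run on an instrument would take longer.

```python
# Represent a thermocycling program and estimate its total hold time.
# Temperatures and durations are from the text; ramp times are ignored.
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float
    seconds: int


def total_hold_seconds(initial, cycle_steps, n_cycles, final):
    """Sum the hold times of the initial step, the repeated cycles, and the final step."""
    per_cycle = sum(step.seconds for step in cycle_steps)
    return initial.seconds + n_cycles * per_cycle + final.seconds


os19_program = {
    "initial": Step(94, 180),                                   # 94 C, 3 min
    "cycle_steps": [Step(94, 30), Step(56, 30), Step(68, 60)],  # denature, anneal, extend
    "n_cycles": 25,
    "final": Step(68, 300),                                     # final extension, 5 min
}

print(total_hold_seconds(**os19_program) / 60, "minutes")
```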
The resulting PCR product (about 1.2 kb) was separated by agarose gel
electrophoresis and purified using a Qiagen PCR purification kit (Qiagen Inc.; Valencia,
CA). The purified product was ligated into pETBlue-1 using the Perfectly Blunt cloning
Kit (Novagen; Madison, WI). The ligation reaction was transformed into NovaBlue
chemically competent cells (Novagen, Madison, WI) that then were spread on LB agar
plates supplemented with 50 µg/mL carbenicillin, 40 µg/mL IPTG, and 40 µg/mL X-Gal.
White colonies were isolated and screened for the presence of inserts by restriction
mapping. Isolates with the correct restriction pattern were sequenced from each end
using the primers pETBlueUP and pETBlueDOWN (Novagen) to confirm the sequence at
the ligation points.

The plasmid containing the OS19 dehydratase-encoding sequence was
transformed into Tuner (DE3) pLacI chemically competent cells (Novagen, Madison,
WI), and expression from the construct was tested. Briefly, a culture was grown overnight
to saturation and diluted 1:20 the following morning in fresh LB medium with the
appropriate antibiotics. The culture was grown at 37°C and 250 rpm to an OD600 of about
0.6. At this point, the culture was induced with IPTG at a final concentration of 1 mM.
The culture was incubated for an additional period [duration illegible in source]. Samples
were taken pre-induction and 2 hours post-induction for SDS-PAGE analysis. A band of
the expected molecular weight (27,336 Daltons predicted from the sequence) was
observed. This band was not observed in cells containing a plasmid lacking the nucleic
acid encoding the hydratase.

Cell free extracts were prepared by growing cells as described above. The cells
were harvested by centrifugation and disrupted by sonication. The sonicated cell
suspension was centrifuged to remove cell debris, and the supernatant was used in the
assays. The ability of the 3-hydroxypropionyl-CoA dehydratase to perform the following
three reactions was measured using MALDI-TOF MS:
1) acrylyl-CoA → 3-hydroxypropionyl-CoA
2) 3-hydroxypropionyl-CoA → acrylyl-CoA
3) crotonyl-CoA → 3-hydroxybutyryl-CoA

The assay mixture contained 50 mM Tris-HCl (pH 7.5), 1 mM CoA ester, and
about 1 µg cell free extract. Reactions were allowed to proceed at room temperature and
were stopped by adding 1 volume 10% trifluoroacetic acid (TFA). The reaction mixtures
were purified prior to MALDI-TOF MS analysis using Sep Pak Vac C18 50 mg columns
(Waters, Inc.). The columns were conditioned with 1 mL methanol and then equilibrated
with two washes of 1 mL 0.1% TFA. The sample was applied to the column, and the
flow through was discarded. The column was washed twice with 1 mL 0.1% TFA. The
sample was eluted in ~200 µL 40% acetonitrile, 0.1% TFA. The acetonitrile was removed
by centrifugation in vacuo. Samples were prepared for MALDI-TOF MS analysis by
mixing 1:1 with 110 mM sinapinic acid in 0.1% TFA, 67% acetonitrile. The samples
were allowed to air dry.

The conversion of acrylyl-CoA into 3-hydroxypropionyl-CoA catalyzed by the
3-hydroxypropionyl-CoA dehydratase was detected using the MALDI-TOF MS technique.
In reaction #1, the control sample exhibited a dominant peak at a molecular weight
corresponding to acrylyl-CoA (MW 823). The reaction #1 sample containing the cell
extract from cells transfected with the 3-hydroxypropionyl-CoA dehydratase-encoding
plasmid exhibited a dominant peak corresponding to 3-hydroxypropionyl-CoA (MW
841). This result demonstrates that the 3-hydroxypropionyl-CoA dehydratase activity
catalyzes reaction #1.

To detect the conversion of 3-hydroxypropionyl-CoA into acrylyl-CoA, reaction
#2 was carried out in 80% D2O. The reaction #2 sample containing the cell extract from
cells transfected with the 3-hydroxypropionyl-CoA dehydratase-encoding plasmid
revealed incorporation of deuterium in the 3-hydroxypropionyl-CoA molecule. This
result indicates that the 3-hydroxypropionyl-CoA dehydratase enzyme catalyzes reaction
#2. In addition, the results from both #1 and #2 reactions indicate that the
3-hydroxypropionyl-CoA dehydratase enzyme can catalyze the 3-hydroxypropionyl-CoA
↔ acrylyl-CoA reaction in both directions. It is noted that for both the #1 and #2
reactions, a peak was observed at MW 811, due to leftover acetyl-CoA from the synthesis
of 3-hydroxypropionyl-CoA from 3-hydroxypropionate and acetyl-CoA.

The assays assessing conversion of crotonyl-CoA into 3-hydroxybutyryl-CoA also
were carried out in 80% D2O. In reaction #3, the control sample exhibited a dominant
peak at a molecular weight corresponding to crotonyl-CoA (MW 837). This result
indicated that the crotonyl-CoA was not converted into other products. The reaction #3
sample containing the cell extract from cells transfected with the 3-hydroxypropionyl-CoA
dehydratase-encoding plasmid exhibited a diffuse group of peaks corresponding to
deuterated 3-hydroxybutyryl-CoA (MW 855 to MW 857). This result demonstrates that
the 3-hydroxypropionyl-CoA dehydratase activity catalyzes reaction #3.
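
The peak assignments in reactions #1 and #3 rest on simple mass arithmetic: hydration of the enoyl-CoA adds one water (nominal mass 18), and each deuterium incorporated from D2O adds one further mass unit. The following Python sketch reproduces the quoted values from the substrate masses given in the text; the helper name is illustrative and nominal (integer) masses are assumed.

```python
# Expected nominal masses for hydration of an enoyl-CoA, with optional
# deuterium incorporation when the reaction is run in D2O.
# Substrate masses are the values quoted in the text.
WATER = 18           # nominal mass added by hydration (H + OH)
DEUTERIUM_SHIFT = 1  # nominal mass difference between D and H


def hydrated_mass(substrate_mw, n_deuterium=0):
    """Nominal mass of the hydration product carrying n_deuterium D atoms."""
    return substrate_mw + WATER + n_deuterium * DEUTERIUM_SHIFT


print(hydrated_mass(823))                         # acrylyl-CoA -> 841
print([hydrated_mass(837, n) for n in range(3)])  # crotonyl-CoA -> [855, 856, 857]
```

The range 855 to 857 corresponds to zero, one, or two deuterium atoms in the product, which is consistent with the diffuse group of peaks reported for reaction #3.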

A series of control reactions were performed to confirm the specificity of the
3-hydroxypropionyl-CoA dehydratase. Lactyl-CoA (1 mM) was added to the reaction
mixture containing 100 mM Tris (pH 7.0) both in the presence and the absence of the
3-hydroxypropionyl-CoA dehydratase. In both cases, the dominant peak observed had a
molecular weight corresponding to lactyl-CoA (MW 841). This result indicates that
lactyl-CoA is not affected by the presence of 3-hydroxypropionyl-CoA dehydratase
activity, even in the presence of D2O, meaning that the 3-hydroxypropionyl-CoA
dehydratase enzyme does not attach a hydroxyl group at the alpha carbon position. The
presence of 3-hydroxypropionyl-CoA in an 80% D2O reaction mixture resulted in a shift
upon addition of the 3-hydroxypropionyl-CoA dehydratase activity. In the absence of
3-hydroxypropionyl-CoA dehydratase activity, a peak corresponding to
3-hydroxypropionyl-CoA was observed in addition to a peak of MW 811. The MW 811
peak was due to leftover acetyl-CoA from the synthesis of 3-hydroxypropionyl-CoA. In
the presence of 3-hydroxypropionyl-CoA dehydratase activity, a peak corresponding to
deuterated 3-hydroxypropionyl-CoA was observed (MW 842) due to exchange of a
hydroxyl group during the conversion of 3-hydroxypropionyl-CoA to acrylyl-CoA and
vice versa. These control reactions demonstrate that the 3-hydroxypropionyl-CoA
dehydratase enzyme is active on 3-hydroxypropionyl-CoA and not active on lactyl-CoA.
In addition, these results demonstrate that the product of the acrylyl-CoA reaction is
3-hydroxypropionyl-CoA, not lactyl-CoA.

Example 4 - Construction of operon #1 
The following operon was constructed and can be used to produce 3-HP in E. coli
(Figure 34). Briefly, the operon was cloned into a pET-11a expression vector under the
control of a T7 promoter (Novagen, Madison, WI). The pET-11a expression vector is a
5677 bp plasmid that uses the ATG sequence of an NdeI restriction site as a start codon
for inserted downstream sequences.
Nucleic acid molecules encoding a CoA transferase and a lactyl-CoA dehydratase
were amplified from Megasphaera elsdenii genomic DNA by PCR. Two primers were
used to amplify the CoA transferase-encoding sequence (OSNBpctF
5'-GGGAATTCCATATGAGAAAAGTAGAAATCATTACAGCTG-3', SEQ ID NO:108 and OSCTE-2
5'-GAGAGTATACACAGTTTTCACCTCCTTTACAGCAGAGAT-3', SEQ ID
NO:109), and two primers were used to amplify the lactyl-CoA dehydratase-encoding
sequence (OSCTE-1 5'-ATCTCTGCTGTAAAGGAGGTCAAAACTGTGTATACTCTC-3',
SEQ ID NO:110 and OSEBH-2
5'-ACGTTGATCTCCTTGTACATTAGAGGATTTCCGAGAAAGC-3', SEQ ID NO:111). A nucleic
acid molecule encoding a 3-hydroxypropionyl-CoA dehydratase was amplified from
Chloroflexus aurantiacus genomic DNA by PCR using two primers (OSEBH-1
5'-GCTTTCTCGGAAATCCTCTAATGTACAAGGAGATCAACGT-3', SEQ ID NO:112 and OSHBR
5'-CGACGGATCCTCAACGACCACTGAAGTTGG-3', SEQ ID NO:113).

PCR was conducted in a Perkin Elmer 2400 Thermocycler using 100 ng of
genomic DNA and a mix of rTth polymerase (Applied Biosystems; Foster City, CA) and
Pfu Turbo polymerase (Stratagene; La Jolla, CA) in an 8:1 ratio. The polymerase mix
ensured higher fidelity of the PCR reaction. The following PCR conditions were used:
initial denaturation step of 94°C for 2 minutes; 20 cycles of 94°C for 30 seconds, 54°C
for 30 seconds, and 68°C for 2 minutes; and a final extension at 68°C for 5 minutes. The
obtained PCR products were gel purified using a Qiagen Gel Extraction Kit (Qiagen, Inc.;
Valencia, CA).

The CoA transferase, lactyl-CoA dehydratase (E1, E2 α subunit, and E2 β
subunit), and 3-hydroxypropionyl-CoA dehydratase PCR products were assembled using
PCR. The OSCTE-1 and OSCTE-2 primers as well as the OSEBH-1 and OSEBH-2
primers were complementary to each other. Thus, the complementary DNA ends could
anneal to each other during the PCR reaction, extending the DNA in both directions. To
ensure the efficiency of the assembly, two end primers (OSNBpctF and OSHBR) were
added to the assembly PCR mixture, which contained 100 ng of each PCR product (i.e.,
the PCR products from the CoA transferase, lactyl-CoA dehydratase, and
3-hydroxypropionyl-CoA dehydratase reactions) as well as the rTth polymerase/Pfu Turbo
polymerase mix described above. The following PCR conditions were used to assemble
the products: 94°C for 1 minute; 25 cycles of 94°C for 30 seconds, 55°C for 30 seconds,
and 68°C for 6 minutes; and a final extension at 68°C for 7 minutes. The assembled PCR
product was gel purified and digested with restriction enzymes (NdeI and BamHI). The
sites for these restriction enzymes were introduced into the assembled PCR product using
the OSNBpctF (NdeI) and OSHBR (BamHI) primers. The digested PCR product was
heated at 80°C for 30 minutes to inactivate the restriction enzymes and used directly for
ligation into the pET-11a vector.
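
The assembly step relies on adjacent PCR products sharing an identical end sequence, contributed by a complementary primer pair such as OSEBH-1 and OSEBH-2. The following Python sketch models only the outcome of that step, joining two top-strand fragments at their shared overlap; it does not model annealing or polymerase extension. The overlap is the OSEBH-1 sequence from the text, while the flanking sequences and function names are made-up placeholders used purely for illustration.

```python
# Sketch of the joining step in assembly (overlap-extension) PCR: two
# fragments that share an identical end sequence are merged at the overlap.
# OVERLAP is the OSEBH-1 primer from the text; the flanking sequences are
# placeholders, not real gene sequence.
OVERLAP = "GCTTTCTCGGAAATCCTCTAATGTACAAGGAGATCAACGT"


def shared_overlap(upstream, downstream, min_len=15):
    """Length of the longest suffix of upstream that is a prefix of downstream."""
    for n in range(min(len(upstream), len(downstream)), min_len - 1, -1):
        if upstream.endswith(downstream[:n]):
            return n
    return 0


def assemble(upstream, downstream, min_len=15):
    """Join two top-strand fragments at their shared overlap."""
    n = shared_overlap(upstream, downstream, min_len)
    if n == 0:
        raise ValueError("fragments share no overlap of the required length")
    return upstream + downstream[n:]


upstream = "ATGAAACCCGGGTTT" + OVERLAP    # placeholder 5' fragment
downstream = OVERLAP + "ACGTACGTACGTTGA"  # placeholder 3' fragment
product = assemble(upstream, downstream)
print(len(product), product.count(OVERLAP))
```

In the assembled product the overlap appears once, so the two coding regions become contiguous on a single molecule that the end primers can then amplify.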

The pET-11a vector was digested with NdeI and BamHI restriction enzymes, gel
purified using a Qiagen Gel Extraction kit, treated with shrimp alkaline phosphatase
(Roche Molecular Biochemicals; Indianapolis, IN) and used in a ligation reaction with the
assembled PCR product. The ligation was performed at 16°C overnight using T4 ligase
(Roche Molecular Biochemicals; Indianapolis, IN). The resulting ligation reaction was
transformed into NovaBlue chemically competent cells (Novagen; Madison, WI) using a
heat-shock method. Once heat shocked, the cells were plated on LB plates supplemented
with 50 µg/mL carbenicillin. The plasmid DNA was purified from individual colonies
using a QiaPrep Spin Miniprep Kit (Qiagen Inc., Valencia, CA) and analyzed by
digestion with NdeI and BamHI restriction enzymes.

Example 5 - Construction of operon #2 

The following operon was constructed and can be used to produce 3-HP in E. coli
(Figure 35A and B). Nucleic acid molecules encoding a CoA transferase and a
lactyl-CoA dehydratase were amplified from Megasphaera elsdenii genomic DNA by PCR.
Two primers were used to amplify the CoA transferase-encoding sequence (OSNBpctF
and OSCTE-2), and two primers were used to amplify the lactyl-CoA
dehydratase-encoding sequence (OSCTE-1 and OSNBelR
5'-CGACGGATCCTTAGAGGATTTCCGAGAAAGC-3', SEQ ID NO:114). A nucleic acid molecule
encoding a 3-hydroxypropionyl-CoA dehydratase was amplified from Chloroflexus
aurantiacus genomic DNA by PCR using two primers (OSXNhF
5'-GGTGTCTAGAGACAGTCCTGTCGTTTATGTAGAAGGAG-3', SEQ ID NO:115 and OSXNhR
5'-GGGAATTCCATATGCGTAACTTCCTCCTGCTATCAACGACCACTGAAGTTGG-3', SEQ ID
NO:116).

PCR was conducted in a Perkin Elmer 2400 Thermocycler using 100 ng of
genomic DNA and a mix of rTth polymerase (Applied Biosystems; Foster City, CA) and
Pfu Turbo polymerase (Stratagene; La Jolla, CA) in an 8:1 ratio. The polymerase mix
ensured higher fidelity of the PCR reaction. The following PCR conditions were used:
initial denaturation step of 94°C for 2 minutes; 20 cycles of 94°C for 30 seconds, 54°C
for 30 seconds, and 68°C for 2 minutes; and a final extension at 68°C for 5 minutes. The
obtained PCR products were gel purified using a Qiagen Gel Extraction Kit (Qiagen, Inc.;
Valencia, CA).

The CoA transferase and lactyl-CoA dehydratase (E1, E2 α subunit, and E2 β
subunit) PCR products were assembled using PCR. The OSCTE-1 and OSCTE-2 primers
were complementary to each other. Thus, the 22 nucleotides at the end of the CoA
transferase sequence and the 22 nucleotides at the beginning of the lactyl-CoA
dehydratase could anneal to each other during the PCR reaction, extending the DNA in
both directions. To ensure the efficiency of the assembly, two end primers (OSNBpctF
and OSNBelR) were added to the assembly PCR mixture, which contained 100 ng of the
CoA transferase PCR product, 100 ng of lactyl-CoA dehydratase PCR product, and the
rTth polymerase/Pfu Turbo polymerase mix described above. The following PCR
conditions were used to assemble the products: 94°C for 1 minute; 20 cycles of 94°C for
30 seconds, 54°C for 30 seconds, and 68°C for 5 minutes; and a final extension at 68°C
for 6 minutes.

The assembled PCR product was gel purified and digested with restriction
enzymes (NdeI and BamHI). The sites for these restriction enzymes were introduced into
the assembled PCR product using the OSNBpctF (NdeI) and OSNBelR (BamHI) primers.
The digested PCR product was heated at 80°C for 30 minutes to inactivate the restriction
enzymes and used directly for ligation into a pET-11a vector.
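
The restriction sites are carried in the 5' tails of the two end primers. The following Python sketch locates them, using the standard recognition sequences for NdeI (CATATG) and BamHI (GGATCC) and the primer sequences as printed in Examples 4 and 5; the helper name is illustrative.

```python
# Locate the restriction sites carried by the two end primers.
# Recognition sequences: NdeI = CATATG, BamHI = GGATCC. Primer sequences
# are copied from the text; the helper function is illustrative.
SITES = {"NdeI": "CATATG", "BamHI": "GGATCC"}

PRIMERS = {
    "OSNBpctF": "GGGAATTCCATATGAGAAAAGTAGAAATCATTACAGCTG",
    "OSNBelR": "CGACGGATCCTTAGAGGATTTCCGAGAAAGC",
}


def find_sites(seq, sites=SITES):
    """Map each enzyme name to the 0-based positions of its site in seq."""
    return {name: [i for i in range(len(seq) - len(site) + 1)
                   if seq.startswith(site, i)]
            for name, site in sites.items()}


for primer, seq in PRIMERS.items():
    print(primer, find_sites(seq))
```

In OSNBpctF the last three bases of the NdeI site are ATG, which is how the vector's NdeI site can supply the start codon for the first coding sequence of the operon.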

The pET-11a vector was digested with NdeI and BamHI restriction enzymes, gel
purified using a Qiagen Gel Extraction kit, treated with shrimp alkaline phosphatase
(Roche Molecular Biochemicals; Indianapolis, IN) and used in a ligation reaction with the
assembled PCR product. The ligation was performed at 16°C overnight using T4 ligase
(Roche Molecular Biochemicals; Indianapolis, IN). The resulting ligation reaction was
transformed into NovaBlue chemically competent cells (Novagen; Madison, WI) using a
heat-shock method. Once heat shocked, the cells were plated on LB plates supplemented
with 50 µg/mL carbenicillin. The plasmid DNA was purified from individual colonies
using a QiaPrep Spin Miniprep Kit (Qiagen Inc., Valencia, CA) and analyzed by
digestion with NdeI and BamHI restriction enzymes. The digest revealed that the DNA
fragment containing CoA transferase-encoding and lactyl-CoA dehydratase-encoding
sequences was cloned into the pET-11a vector.

The plasmid carrying the CoA transferase-encoding and lactyl-CoA
dehydratase-encoding sequences (pTD) was digested with XbaI and NdeI restriction
enzymes, gel purified, and used for cloning the 3-hydroxypropionyl-CoA
dehydratase-encoding product upstream of the CoA transferase-encoding sequence. Since
this XbaI and NdeI digest eliminated a ribosome-binding site (RBS) from the pET-11a
vector, a new homologous RBS was cloned into the plasmid together with the
3-hydroxypropionyl-CoA dehydratase-encoding product. Briefly, the
3-hydroxypropionyl-CoA dehydratase-encoding PCR product was digested with XbaI and
NdeI restriction enzymes, heated at 65°C for 30 minutes to inactivate the restriction
enzymes, and ligated into pTD. The ligation mixture was transformed into chemically
competent NovaBlue cells (Novagen) that then were plated on LB plates supplemented
with 50 µg/mL carbenicillin.

Individual colonies were selected, and the plasmid DNA was obtained using a Qiagen
Spin Miniprep Kit. The obtained plasmids were digested with XbaI and NdeI restriction
enzymes and analyzed by gel electrophoresis. pTD plasmids containing the inserted
3-hydroxypropionyl-CoA dehydratase-encoding PCR product were named pHTD. While
expression of the lactyl-CoA hydratase, CoA transferase, and 3-hydroxypropionyl-CoA
dehydratase sequences from pHTD was directed by a single T7 promoter, each coding
sequence had an individual RBS upstream of its start codon.

To ensure the correct assembly and cloning of the lactyl-CoA hydratase, CoA
transferase, and 3-hydroxypropionyl-CoA dehydratase sequences into one operon, both
ends of the operon and all junctions between the coding sequences were sequenced. This
DNA analysis revealed that the operon was assembled correctly.

The pHTD plasmid was transformed into BL21(DE3) cells to study the expression
of the encoded sequences.

Example 6 - Construction of operons #3 and #4
Operon #3 (Figure 36A and B) and operon #4 (Figure 37A and B) each position
the E1 activator at the end of the operon. Operon #3 contains a RBS between the
3-hydroxypropionyl-CoA dehydratase-encoding sequence and the E1 activator-encoding
sequence. In operon #4, however, the stop codon of the 3-hydroxypropionyl-CoA
dehydratase-encoding sequence is fused with the start codon of the E1 activator-encoding
sequence as follows: TAGTG. The absence of the RBS in operon #4 can decrease the
level of E1 activator expression.
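
The fused junction TAGTG can be read as a TAG stop codon and a GTG start codon that share the central G. The following Python sketch locates that junction in the OSHEIF primer, whose sequence is given later in this example for operon #4; the helper name is illustrative.

```python
# Show how the five-nucleotide junction TAGTG holds a stop codon and a
# start codon that overlap by one base. OSHEIF is the operon #4 primer
# given later in this example; the helper is illustrative.
JUNCTION = "TAGTG"
OSHEIF = "CCAACTTCAGTGGTCGTTAGTGAAAACTGTGTATACTCTC"


def split_junction(seq, junction=JUNCTION):
    """Return (stop codon, start codon, index of the shared base) in seq."""
    i = seq.find(junction)
    if i < 0:
        raise ValueError("junction not found")
    stop = seq[i:i + 3]       # last codon of the upstream coding sequence
    start = seq[i + 2:i + 5]  # first codon of the downstream coding sequence
    return stop, start, i + 2


print(split_junction(OSHEIF))
```

Because the start codon begins inside the stop codon, there is no room for a separate ribosome-binding site between the two coding sequences, which is consistent with the lower E1 activator expression anticipated for operon #4.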

To construct operon #3, nucleic acid molecules encoding a CoA transferase and a
lactyl-CoA dehydratase were amplified from Megasphaera elsdenii genomic DNA by
PCR. Two primers were used to amplify the CoA transferase-encoding sequence
(OSNBpctF and OSHTR 5'-ACGTTCATCTCOTCT[...]CCCATG-3', SEQ ID NO:117), two
primers were used to amplify the E2 α and β subunits of the lactyl-CoA
dehydratase-encoding sequence (OSEHXNF
5'-GGTGTCTAGAGTCAAAGGAGAGAACAAAATCATGAGTG-3', SEQ ID NO:118
and OSEHXNR 5'-GGGAATTCCATATGCGTAACTTCCTCCTGCTATTAGAGGATTTCCGAGAAAGC-3',
SEQ ID NO:119), and two primers were used to amplify the E1 activator of the
lactyl-CoA dehydratase-encoding sequence (OSHrEIF
5'-TCAGTGGTCGTTGATCACGCTATAAAGAAAGGTGAAAACTGTGTATACTCTC-3', SEQ
ID NO:120 and OSEIBR 5'-CGACGGATCCCTTCCTTGGAGCTCATGCTTTC-3',
SEQ ID NO:121). A nucleic acid molecule encoding a 3-hydroxypropionyl-CoA
dehydratase was amplified from Chloroflexus aurantiacus genomic DNA by PCR
using two primers (OSTHF 5'-CATGGGACTGAAAAAATAATGTAGAAGGAGATCAACGT-3',
SEQ ID NO:122 and OSEIrHR
5'-GAGAGTATACACAGTTTTCACCTTTCTTTATAGCGTGATCAACGACCACTGA-3', SEQ ID NO:123).

PCR was conducted in a Perkin Elmer 2400 Thermocycler using 100 ng of
genomic DNA and a mix of rTth polymerase (Applied Biosystems; Foster City, CA) and
Pfu Turbo polymerase (Stratagene; La Jolla, CA) in an 8:1 ratio. The polymerase mix
ensured higher fidelity of the PCR reaction. The following PCR conditions were used:
initial denaturation step of 94°C for 2 minutes; 20 cycles of 94°C for 30 seconds, 54°C
for 30 seconds, and 68°C for 2 minutes; and a final extension at 68°C for 5 minutes. The
obtained PCR products were gel purified using a Qiagen Gel Extraction Kit (Qiagen, Inc.;
Valencia, CA).

The 3-hydroxypropionyl-CoA dehydratase and E1 activator PCR products were
assembled using PCR. The OSHrEIF and OSEIrHR primers were complementary to
each other. Thus, the primers could anneal to each other during the PCR reaction,
extending the DNA in both directions. To ensure the efficiency of the assembly, two end
primers (OSTHF and OSEIBR) were added to the assembly PCR mixture, which
contained 100 ng of the 3-hydroxypropionyl-CoA dehydratase PCR product, 100 ng of
E1 activator PCR product, and the rTth polymerase/Pfu Turbo polymerase mix described
above. The following PCR conditions were used to assemble the products: 94°C for 1
minute; 20 cycles of 94°C for 30 seconds, 54°C for 30 seconds, and 68°C for 1.5 minutes;
and a final extension at 68°C for 5 minutes.

The assembled PCR product was gel purified and used in a second assembly PCR
with the gel-purified CoA transferase PCR product. The OSTHF and OSHTR primers
were complementary to each other. Thus, the complementary DNA ends could anneal to
each other during the PCR reaction, extending the DNA in both directions. To ensure the
efficiency of the assembly, two end primers (OSNBpctF and OSEIBR) were added to the
second assembly PCR mixture, which contained 100 ng of the purified
3-hydroxypropionyl-CoA dehydratase/EI PCR assembly, 100 ng of the purified CoA
transferase PCR product, and the polymerase mix described above. The following PCR
conditions were used to assemble the products: 94°C for 1 minute; 20 cycles of 94°C for
30 seconds, 54°C for 30 seconds, and 68°C for 3 minutes; and a final extension at 68°C
for 5 minutes.

The assembled PCR product was gel purified and digested with NdeI and BamHI
restriction enzymes. The sites for these restriction enzymes were introduced into the
assembled PCR products with the OSNBpctF (NdeI) and OSEIBR (BamHI) primers. The
digested PCR product was heated at 80°C for 30 minutes to inactivate the restriction
enzymes and used directly for ligation into a pET11a vector.

The pET-11a vector was digested with NdeI and BamHI restriction enzymes, gel
purified using a Qiagen Gel Extraction kit, treated with shrimp alkaline phosphatase
(Roche Molecular Biochemicals; Indianapolis, IN) and used in a ligation reaction with the
assembled PCR product. The ligation was performed at 16°C overnight using T4 ligase
(Roche Molecular Biochemicals; Indianapolis, IN). The resulting ligation reaction was
transformed into NovaBlue chemically competent cells (Novagen; Madison, WI) using a
heat-shock method. Once heat shocked, the cells were plated on LB plates supplemented
with 50 µg/mL carbenicillin. The plasmid DNA was purified from individual colonies
using a QiaPrep Spin Miniprep Kit (Qiagen Inc.; Valencia, CA). The resulting plasmids
carrying the CoA transferase, 3-hydroxypropionyl-CoA dehydratase, and E1 activator
sequences (pTHrEI) were digested with XbaI and NdeI, purified using gel electrophoresis
and a Qiagen Gel Extraction kit, and used as a vector for cloning of the E2 α subunit/E2 β
subunit PCR product.

The E2 α subunit/E2 β subunit PCR product was digested with the same enzymes
and ligated into the pTHrEI vector. The ligation reaction was performed at 16°C
overnight using T4 ligase (Roche Molecular Biochemicals; Indianapolis, IN). The
ligation mixture was transformed into chemically competent NovaBlue cells (Novagen)
that then were plated on LB plates supplemented with 50 µg/mL carbenicillin. The
plasmid DNA was purified from individual colonies using a QiaPrep Spin Miniprep Kit
(Qiagen Inc., Valencia, CA) and digested with XbaI and NdeI restriction enzymes for gel
electrophoresis analysis. The resulting plasmids carrying the constructed operon #3
(pEHTHrEI) were transformed into BL21(DE3) cells to study the expression of the
cloned sequences. Electrospray mass spectrometry assay confirmed that extracts from
these cells have CoA transferase activity and 3-hydroxypropionyl-CoA dehydratase
activity. Similar assays are used to confirm that extracts from these cells also have
lactyl-CoA dehydratase activity.

To construct operon #4, nucleic acid molecules encoding a CoA transferase and a
lactyl-CoA dehydratase were amplified from Megasphaera elsdenii genomic DNA by
PCR. Two primers were used to amplify the CoA transferase-encoding sequence
(OSNBpctF and OSHTR), two primers were used to amplify the E2 α and β subunits of
the lactyl-CoA dehydratase-encoding sequence (OSEHXNF and OSEHXNR), and two
primers were used to amplify the E1 activator of the lactyl-CoA dehydratase-encoding
sequence (OSHEIF 5'-CCAACTTCAGTGGTCGTTAGTGAAAACTGTGTATACTCTC-3',
SEQ ID NO:124 and OSEIBR). A nucleic acid molecule encoding a
3-hydroxypropionyl-CoA dehydratase was amplified from Chloroflexus aurantiacus
genomic DNA by PCR using two primers (OSTHF and OSEIHR
5'-GAGAGTATACACAGTTTTCACTAACGACCACTGAAGTTGG-3', SEQ ID
NO:125).

PCR was conducted in a Perkin Elmer 2400 Thermocycler using 100 ng of
genomic DNA and a mix of rTth polymerase (Applied Biosystems; Foster City, CA) and
Pfu Turbo polymerase (Stratagene; La Jolla, CA) in an 8:1 ratio. The polymerase mix
ensured higher fidelity of the PCR reaction. The following PCR conditions were used:
initial denaturation step of 94°C for 2 minutes; 20 cycles of 94°C for 30 seconds, 54°C
for 30 seconds, and 68°C for 2 minutes; and a final extension at 68°C for 5 minutes. The
obtained PCR products were gel purified using a Qiagen Gel Extraction Kit (Qiagen, Inc.;
Valencia, CA).

The 3-hydroxypropionyl-CoA dehydratase and E1 activator PCR products were
assembled using PCR. The OSHEIF and OSEIHR primers were complementary to each
other. Thus, the primers could anneal to each other during the PCR reaction, extending
the DNA in both directions. To ensure the efficiency of the assembly, two end primers
(OSTHF and OSEIBR) were added to the assembly PCR mixture, which contained 100
ng of the 3-hydroxypropionyl-CoA dehydratase PCR product, 100 ng of E1 activator
PCR product, and the rTth polymerase/Pfu Turbo polymerase mix described above. The
following PCR conditions were used to assemble the products: 94°C for 1 minute; 20
cycles of 94°C for 30 seconds, 54°C for 30 seconds, and 68°C for 1.5 minutes; and a final
extension at 68°C for 5 minutes.

The assembled PCR product was gel purified and used in a second assembly PCR
with the gel-purified CoA transferase PCR product. The OSTHF and OSHTR primers
were complementary to each other. Thus, the complementary DNA ends could anneal to
each other during the PCR reaction, extending the DNA in both directions. To ensure the
efficiency of the assembly, two end primers (OSNBpctF and OSEIBR) were added to the
second assembly PCR mixture, which contained 100 ng of the purified
3-hydroxypropionyl-CoA dehydratase/EI PCR assembly, 100 ng of the purified CoA
transferase PCR product, and the polymerase mix described above. The following PCR
conditions were used to assemble the products: 94°C for 1 minute; 20 cycles of 94°C for
30 seconds, 54°C for 30 seconds, and 68°C for 3 minutes; and a final extension at 68°C
for 5 minutes.

The assembled PCR product was gel purified and digested with NdeI and BamHI
restriction enzymes. The sites for these restriction enzymes were introduced into the
assembled PCR products with the OSNBpctF (NdeI) and OSEIBR (BamHI) primers. The
digested PCR product was heated at 80°C for 30 minutes to inactivate the restriction
enzymes and used directly for ligation into a pET-11a vector.

The pET-11a vector was digested with NdeI and BamHI restriction enzymes, gel
purified using a Qiagen Gel Extraction kit, treated with shrimp alkaline phosphatase
(Roche Molecular Biochemicals; Indianapolis, IN), and used in a ligation reaction with the
assembled PCR product. The ligation was performed at 16°C overnight using T4 ligase
(Roche Molecular Biochemicals; Indianapolis, IN). The resulting ligation reaction was
transformed into NovaBlue chemically competent cells (Novagen; Madison, WI) using a
heat-shock method. Once shocked, the cells were plated on LB plates supplemented with
50 µg/mL carbenicillin. The plasmid DNA was purified from individual colonies using a
QiaPrep Spin Miniprep Kit (Qiagen Inc.; Valencia, CA). The resulting plasmids carrying
the CoA transferase, 3-hydroxypropionyl-CoA dehydratase, and E1 activator sequences
(pTHEI) were digested with XbaI and NdeI, purified using gel electrophoresis and a
Qiagen Gel Extraction kit, and used as a vector for cloning of the E2 α subunit/E2 β
subunit PCR product.

The E2 α subunit/E2 β subunit PCR product was digested with the same enzymes
and ligated into the pTHEI vector. The ligation reaction was performed at 16°C
overnight using T4 ligase (Roche Molecular Biochemicals; Indianapolis, IN). The
ligation mixture was transformed into chemically competent NovaBlue cells (Novagen)
that then were plated on LB plates supplemented with 50 µg/mL carbenicillin. The
plasmid DNA was purified from individual colonies using a QiaPrep Spin Miniprep Kit
(Qiagen Inc.; Valencia, CA) and digested with XbaI and NdeI restriction enzymes for gel
electrophoresis analysis. The resulting plasmids carrying the constructed operon #4
(pEIITHrEI) were transformed into BL21(DE3) cells to study the expression of the cloned
sequences. Electrospray mass spectrometry assays confirmed that extracts from these
cells have CoA transferase activity and 3-hydroxypropionyl-CoA dehydratase activity.
Similar assays are used to confirm that extracts from these cells also have lactyl-CoA
dehydratase activity.

E. coli plasmid pEIITHrEI carrying a synthetic 3-HP operon was digested with
NruI, XbaI, and BamHI restriction enzymes. The XbaI-BamHI DNA fragment was gel
purified with a Qiagen Gel Extraction Kit (Qiagen, Inc.; Valencia, CA) and used for
further cloning into the Bacillus vector pWH1520 (MoBiTec GmbH; Göttingen, Germany).
Vector pWH1520 was digested with SpeI and BamHI restriction enzymes and gel purified
with a Qiagen Gel Extraction Kit. The XbaI-BamHI fragment carrying the 3-HP operon was
ligated into the pWH1520 vector at 16°C overnight using T4 ligase. The ligation mixture
was transformed into chemically competent TOP10 cells and plated on LB plates
supplemented with 50 µg/mL carbenicillin. One clone, named B. megaterium (pBP026),
was used for assays of CoA-transferase and CoA-hydratase activities. The assays were
performed as described above for E. coli. The enzymatic activities were 5 U/mg and 13
U/mg, respectively.

Example 7 - Construction of a two plasmid system
The following constructs were made and can be used to produce 3-HP in E. coli
(Figure 38A and B). Nucleic acid molecules encoding a CoA transferase and a lactyl-CoA
dehydratase were amplified from Megasphaera elsdenii genomic DNA by PCR.
Two primers were used to amplify the CoA transferase-encoding sequence (OSNBpctF
and OSHTR), two primers were used to amplify the E2 α and β subunits of the lactyl-CoA
dehydratase-encoding sequence (OSEIIXNF and OSEIIXNR), and two primers were
used to amplify the E1 activator of the lactyl-CoA dehydratase-encoding sequence
(E1PROF 5'-GTCGCAGAATTCCCATCAATCGCAGCAATCCCAAC-3', SEQ ID
NO:126 and E1PROR 5'-TAACATGGTACCGACAGAAGCGGACCAGCAAACGA-3',
SEQ ID NO:127). A nucleic acid molecule encoding a 3-hydroxypropionyl-CoA
dehydratase was amplified from Chloroflexus aurantiacus genomic DNA by PCR
using two primers (OSTHF and OSHBR 5'-CGACGGATCCTCAACGACCACTGAAGTTGG-3',
SEQ ID NO:128).

PCR was conducted in a Perkin Elmer 2400 Thermocycler using 100 ng of
genomic DNA and a mix of rTth polymerase (Applied Biosystems; Foster City, CA) and
Pfu Turbo polymerase (Stratagene; La Jolla, CA) in an 8:1 ratio. The polymerase mix
ensured higher fidelity of the PCR reaction. The following PCR conditions were used:
an initial denaturation step of 94°C for 2 minutes; 20 cycles of 94°C for 30 seconds, 54°C
for 30 seconds, and 68°C for 2 minutes; and a final extension at 68°C for 5 minutes. The
obtained PCR products were gel purified using a Qiagen Gel Extraction Kit (Qiagen, Inc.;
Valencia, CA).

The CoA transferase PCR product and the 3-hydroxypropionyl-CoA dehydratase
PCR product were assembled using PCR. The OSTHF and OSHTR primers were
complementary to each other. Thus, the complementary DNA ends could anneal to each
other during the PCR reaction, extending the DNA in both directions. To ensure the
efficiency of the assembly, two end primers (OSNBpctF and OSHBR) were added to the
assembly PCR mixture, which contained 100 ng of the purified CoA transferase PCR
product, 100 ng of the purified 3-hydroxypropionyl-CoA dehydratase PCR product, and
the polymerase mix described above. The following PCR conditions were used to
assemble the products: 94°C for 1 minute; 20 cycles of 94°C for 30 seconds, 54°C for 30
seconds, and 68°C for 2.5 minutes; and a final extension at 68°C for 5 minutes.

The assembled PCR product was gel purified and digested with NdeI and BamHI
restriction enzymes. The sites for these restriction enzymes were introduced into the
assembled PCR products with the OSNBpctF (NdeI) and OSHBR (BamHI) primers. The
digested PCR product was heated at 80°C for 30 minutes to inactivate the restriction
enzymes and used directly for ligation into a pET-11a vector.
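
As an illustrative check of how a cloning site is carried in a primer tail, the primers whose full sequences are given in this example can be scanned for the recognition sequences of the enzymes used. The recognition sequences below are the standard ones for these enzymes; the `find_sites` helper is a hypothetical name, and the sketch assumes the primer sequences read as 5'-CGACGGATCCTCAACGACCACTGAAGTTGG-3' (OSHBR) and 5'-GTCGCAGAATTCCCATCAATCGCAGCAATCCCAAC-3' (E1PROF). Because these sites are palindromic, scanning the primer strand alone is sufficient.

```python
# Standard recognition sequences for the enzymes named in this example.
SITES = {
    "NdeI": "CATATG",
    "BamHI": "GGATCC",
    "EcoRI": "GAATTC",
    "KpnI": "GGTACC",
}


def find_sites(primer: str) -> dict:
    """Map each enzyme whose site occurs in the primer to its 0-based position."""
    seq = primer.upper()
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


OSHBR = "CGACGGATCCTCAACGACCACTGAAGTTGG"
E1PROF = "GTCGCAGAATTCCCATCAATCGCAGCAATCCCAAC"

for name, primer in (("OSHBR", OSHBR), ("E1PROF", E1PROF)):
    print(name, find_sites(primer))
```

In both primers the site sits a few bases in from the 5' end, leaving extra bases on its 5' side.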

The pET-11a vector was digested with NdeI and BamHI restriction enzymes, gel
purified using a Qiagen Gel Extraction kit, treated with shrimp alkaline phosphatase
(Roche Molecular Biochemicals; Indianapolis, IN), and used in a ligation reaction with the
assembled PCR product. The ligation was performed at 16°C overnight using T4 ligase
(Roche Molecular Biochemicals; Indianapolis, IN). The resulting ligation reaction was
transformed into NovaBlue chemically competent cells (Novagen; Madison, WI) using a
heat-shock method. Once shocked, the cells were plated on LB plates supplemented with
50 µg/mL carbenicillin. The plasmid DNA was purified from individual colonies using a
QiaPrep Spin Miniprep Kit (Qiagen Inc.; Valencia, CA) and digested with NdeI and
BamHI restriction enzymes for gel electrophoresis analysis. The resulting plasmids
carrying the CoA transferase and 3-hydroxypropionyl-CoA dehydratase (pTH) were
digested with XbaI and NdeI, purified using gel electrophoresis and a Qiagen Gel
Extraction kit, and used as a vector for cloning of the E2 α subunit/E2 β subunit PCR
product.

The E2 α subunit/E2 β subunit PCR product digested with the same enzymes was
ligated into the pTH vector. The ligation reaction was performed at 16°C overnight using
T4 ligase (Roche Molecular Biochemicals; Indianapolis, IN). The ligation mixture was
transformed into chemically competent NovaBlue cells (Novagen) that then were plated
on LB plates supplemented with 50 µg/mL carbenicillin. The plasmid DNA was purified
from individual colonies using a QiaPrep Spin Miniprep Kit (Qiagen Inc.; Valencia, CA)
and digested with XbaI and NdeI restriction enzymes for gel electrophoresis analysis.
The resulting plasmids carrying the E2 α and β subunits of the lactyl-CoA dehydratase,
the CoA transferase, and the 3-hydroxypropionyl-CoA dehydratase (pEIITH) were
transformed into BL21(DE3) cells to study the expression of the cloned sequences.

The gel purified E1 activator PCR product was digested with EcoRI and KpnI
restriction enzymes, heated at 65°C for 30 minutes, and ligated into a vector
(pPROLar.A) that was digested with EcoRI and KpnI restriction enzymes, gel purified
using a Qiagen Gel Extraction kit, and treated with shrimp alkaline phosphatase (Roche
Molecular Biochemicals; Indianapolis, IN). The ligation was performed at 16°C
overnight using T4 ligase (Roche Molecular Biochemicals; Indianapolis, IN). The
resulting ligation reaction was transformed into DH10B electro-competent cells (Gibco
Life Technologies; Gaithersburg, MD) using electroporation. Once electroporated, the
cells were plated on LB plates supplemented with 25 µg/mL kanamycin. The plasmid
DNA was purified from individual colonies using a QiaPrep Spin Miniprep Kit (Qiagen
Inc.; Valencia, CA) and digested with EcoRI and KpnI restriction enzymes for gel
electrophoresis analysis. The resulting plasmids carrying the E1 activator (pPROEI) are
transformed into BL21(DE3) cells to study the expression of the cloned sequence.

The pPROEI and pEIITH plasmids are compatible plasmids that can be used in
the same bacterial host cell. In addition, the expression from the pPROEI and pEIITH
plasmids can be induced at different levels using IPTG and arabinose, allowing for the
fine-tuning of the expression of the cloned sequences.






Example 8 - Production of 3-HP 
3-HP was produced using recombinant E. coli in a small-scale batch fermentation
reaction. A derivative of strain ALS484 (also designated as TA3476 (J. Bacteriol.,
143:1081-1085 (1980))) that carried an inducible T7 RNA polymerase was constructed using
a λDE3 lysogenization kit (Novagen; Madison, WI) according to the manufacturer's
instructions. The constructed strain was designated ALS484(DE3). Strain ALS484(DE3)
was transformed with the pEIITHrEI plasmid using standard electroporation techniques. The
transformants were selected on LB/carbenicillin (50 µg/mL) plates. A single colony was
used to inoculate a 4 mL culture in a 15 mL culture tube. Strain ALS484(DE3)
carrying vector pET-11a was used as a control. The cells were grown at 37°C and 250
rpm in an Innova 4230 Incubator Shaker (New Brunswick Scientific; Edison, NJ) for
eight to nine hours. This culture (3 mL) was used to start an anaerobic fermentation.
Two 100 mL anaerobic cultures of ALS(DE3)/pET-11a and ALS(DE3)/pEIITHrEI were
grown in serum bottles using LB media supplemented with 0.4% glucose, 50 µg/mL
carbenicillin, and 100 mM MOPS. The cultures were grown overnight at 37°C without
shaking. The overnight grown cultures were sub-cultured in serum bottles using fresh LB
media supplemented with 0.4% glucose, 50 µg/mL carbenicillin, and 100 mM MOPS.
The starting OD(600) of these cultures was adjusted to 0.3. These serum bottles were
incubated at 37°C without shaking. After one hour of incubation, the cultures were
induced with 100 µM IPTG. A 3 mL sample was taken from each of the serum bottles at
30 minutes, 1 hour, 2 hours, 4 hours, 6 hours, 8 hours, and 24 hours. The samples were
transferred into two properly labeled 2 mL microcentrifuge tubes, each containing 1.5 mL
of sample. The samples were spun down in a microcentrifuge at 14,000 x g for 3
minutes. The supernatant was passed through a 0.45 µm syringe filter, and the resulting
filtrate was stored at -20°C until further analysis. The formation of fermentation
products, mainly lactate and 3-hydroxypropionate, was measured by detecting derivatized
CoA esters of lactate and 3-HP using LC/MS.

The following methods were performed to convert lactate and 3-HP into their
respective CoA esters. Briefly, the filtrates were mixed with CoA-reaction buffer (200
mM potassium phosphate buffer, 2 mM acetyl-CoA, and 0.1 mg/mL purified transferase)
in a 1:1 ratio. The reaction was allowed to proceed for 20 minutes at room temperature.
The reaction was stopped by adding 1 volume of 10% TFA. The sample was purified
using Sep Pak Vac columns (Waters). The column was conditioned with methanol and
washed two times with 0.1% TFA. The sample was then applied to the column, and the
column was washed two more times with 0.1% TFA. The sample was eluted with 40%
acetonitrile, 0.1% TFA. The acetonitrile was removed from the sample by vacuum
centrifugation. The samples were then analyzed by LC/MS.

Analysis of the standard CoA/CoA thioester mixtures and the CoA thioester
mixtures derived from fermentation broths was carried out using a Waters/Micromass
ZQ LC/MS instrument, which had a Waters 2690 liquid chromatograph with a Waters 996
Photo-Diode Array (PDA) absorbance monitor placed in series between the
chromatograph and the single quadrupole mass spectrometer. LC separations were made
using a [...] reversed-phase chromatography column at room temperature. Two gradient
elution systems were developed using different mobile phases for the separation of the
CoA esters. These two systems are summarized in Table 3. Elution system 1 was developed
to provide the most rapid and efficient separation of the five-component CoA/CoA
thioester mixture (CoA, acetyl-CoA, lactyl-CoA, acrylyl-CoA, and propionyl-CoA), while
elution system 2 was developed to provide baseline separation of the structurally isomeric
esters lactyl-CoA and 3-HP-CoA in addition to separation of the remaining esters listed
above. In all cases, the flow rate was 0.250 mL/minute, and photodiode array UV
absorbance was monitored from 200 nm to 400 nm. All parameters of the electrospray MS
system were optimized and selected based on generation of protonated molecular ions
([M + H]+) of the analytes of interest and production of characteristic fragment ions. The
following instrumental parameters were used for ESI-MS detection of CoA and organic
acid-CoA thioesters in the positive ion mode: Capillary: 4.0 V; Cone: 56 V; Extractor: 1 V;
RF lens: 0 V; Source temperature: 100°C; Desolvation temperature: 300°C; Desolvation
gas: 500 L/hour; Cone gas: 40 L/hour; Low mass resolution: 13.0; High mass resolution:
14.5; Ion energy: 0.5; Multiplier: 650. Uncertainties for reported mass/charge ratios (m/z)
and molecular masses are ± 0.01%. Table 3 provides a summary of gradient elution
systems for the separation of organic acid-Coenzyme A thioesters.

The following methods were used to separate the derivatized 3-hydroxypropionyl-CoA,
which was formed from 3-HP, from 2-hydroxypropionyl-CoA (i.e., lactyl-CoA),
which was formed from lactate. Because these structural isomers have identical masses
and mass spectral fragmentation behavior, the separation and identification of these
analytes in a mixture depends on their chromatographic separation. While elution system
1 provided excellent separation of the CoA thioesters tested (Figure 46), it was unable to
resolve 3-HP-CoA and lactyl-CoA. An alternative LC elution system was developed
using ammonium acetate and triethylamine (system 2; Table 3).
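
The reason mass alone cannot distinguish the two thioesters can be shown with a short calculation of the expected [M + H]+ value. This is a sketch under stated assumptions: the elemental compositions of coenzyme A (C21H36N7O16P3S) and of the C3H6O3 hydroxy acids, the monoisotopic atomic masses, and the proton mass are standard reference values rather than values taken from this document, and the function names are hypothetical.

```python
# Monoisotopic atomic masses (Da) and proton mass; standard reference values.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "P": 30.97376200,
    "S": 31.97207117,
}
PROTON = 1.00727647

COENZYME_A = {"C": 21, "H": 36, "N": 7, "O": 16, "P": 3, "S": 1}
HYDROXY_ACID = {"C": 3, "H": 6, "O": 3}  # lactic acid and 3-HP share this formula
WATER = {"H": 2, "O": 1}


def thioester_formula(acid):
    """Formula of the acyl-CoA thioester: CoA + acid - H2O (condensation)."""
    atoms = dict(COENZYME_A)
    for element, count in acid.items():
        atoms[element] = atoms.get(element, 0) + count
    for element, count in WATER.items():
        atoms[element] -= count
    return atoms


def protonated_mz(formula):
    """Monoisotopic m/z of the singly protonated molecule, [M + H]+."""
    mass = sum(MONOISOTOPIC[element] * count for element, count in formula.items())
    return mass + PROTON


formula = thioester_formula(HYDROXY_ACID)
mz = protonated_mz(formula)
print(formula, f"[M + H]+ = {mz:.4f}")
```

Because lactic acid and 3-HP have the same elemental composition, both thioesters give the same value, which rounds to the m/z = 840 used for the ion chromatograms in this example. Only the chromatographic retention time separates them.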

The ability of system 2 to separate 3-HP-CoA and lactyl-CoA was tested on a
mixture of these two compounds. Comparing the results from a mixture of 3-HP-CoA
and lactyl-CoA (Figure 47, Panel A) to the results from lactyl-CoA only (Figure 47, Panel
B) revealed that system 2 can separate 3-HP-CoA and lactyl-CoA. The mass spectrum
recorded under peak 1 (Figure 47, Panel A insert) was used to identify peak 1 as being a
hydroxypropionyl-CoA thioester when compared to Figure 46, Panel C. In addition,
comparison of Panels A and B of Figure 47 as well as the mass spectra results
corresponding to each peak revealed that peak 1 corresponds to 3-HP-CoA and peak 2
corresponds to lactyl-CoA.

System 2 was used to confirm that E. coli transfected with pEIITHrEI produced
3-HP in that 3-HP-CoA was detected. Specifically, an ion chromatogram for m/z = 840 in
the analysis of a CoA transferase-treated fermentation broth aliquot collected from a
culture of E. coli containing pEIITHrEI revealed the presence of 3-HP-CoA (Figure 48,
Panel A). The CoA transferase-treated fermentation broth aliquot collected from a
culture of E. coli lacking pEIITHrEI did not exhibit the peak corresponding to 3-HP-CoA
(Figure 48, Panel B). Thus, these results indicate that the pEIITHrEI plasmid directs the
expression of polypeptides having propionyl-CoA transferase activity, lactyl-CoA
dehydratase activity, and acrylyl-CoA hydratase activity. These results also indicate that
expression of these polypeptides leads to the formation of 3-HP.

Example 9 - Cloning nucleic acid molecules that encode
a polypeptide having acetyl-CoA carboxylase activity

Polypeptides having acetyl-CoA carboxylase activity catalyze the first committed
step of fatty acid synthesis by carboxylation of acetyl-CoA to malonyl-CoA.
Polypeptides having acetyl-CoA carboxylase activity are also responsible for providing
malonyl-CoA for the biosynthesis of very-long-chain fatty acids required for proper cell
function. Polypeptides having acetyl-CoA carboxylase activity can be biotin-dependent
enzymes in which the cofactor biotin is post-translationally attached to a specific lysine
residue. The reaction catalyzed by such polypeptides consists of two discrete half
reactions. In the first half reaction, biotin is carboxylated by bicarbonate in an
ATP-dependent reaction to form carboxybiotin. In the second half reaction, the carboxyl
group is transferred to acetyl-CoA to form malonyl-CoA.
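
For reference, the two half reactions can be written out as a scheme. The ADP and inorganic phosphate products of the first step are not stated in the text and are added here from the generally accepted mechanism of biotin-dependent carboxylases; "Enz" denotes the biotin-bearing polypeptide.

```latex
\begin{align}
\text{Enz-biotin} + \mathrm{HCO_3^-} + \text{ATP}
  &\longrightarrow \text{Enz-carboxybiotin} + \text{ADP} + \mathrm{P_i} \\
\text{Enz-carboxybiotin} + \text{acetyl-CoA}
  &\longrightarrow \text{Enz-biotin} + \text{malonyl-CoA}
\end{align}
```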

Prokaryotic and eukaryotic polypeptides having acetyl-CoA carboxylase activity
exist. The prokaryotic polypeptide is a multi-subunit enzyme (four subunits), where each
of the subunits is encoded by a different nucleic acid sequence. For example, in E. coli,
the following genes encode for the four subunits of acetyl-CoA carboxylase:

accA: Acetyl-coenzyme A carboxylase carboxyl transferase subunit alpha
(GenBank® accession number M96394)
accB: Biotin carboxyl carrier protein (GenBank® accession number U18997)
accC: Biotin carboxylase (GenBank® accession number U18997)
accD: Acetyl-coenzyme A carboxylase carboxyl transferase subunit beta
(GenBank® accession number M68934)

The eukaryotic polypeptide is a high molecular weight multi-functional enzyme
encoded by a single gene. For example, in Saccharomyces cerevisiae, the acetyl-CoA
carboxylase can have the sequence set forth in GenBank® accession number M92156.
The prokaryotic type acetyl-CoA carboxylase from E. coli was overexpressed
using T7 promoter vector pFN476 as described elsewhere (Davis et al., J. Biol. Chem.,
275:28593-28598 (2000)). The eukaryotic type acetyl-CoA carboxylase gene was
amplified from Saccharomyces cerevisiae genomic DNA. Two primers were designed to
amplify the acc1 gene from S. cerevisiae (acc1F 5'-
atagOCGGCCGCAGGMTGCTGTATGAGCGAAGAAAGCnTACT C-3', SEQ ID
NO:138 where the bold is homologous sequence, the italics is a Not I site, the underline
is a RBS, and the lowercase is extra; and acc1R 5'-atgctcgc^tCTC^CTAG-
CT AAATTAAATT AC ATC AAT AGT A-3', SEQ ID NO:139 where the bold is
homologous sequence, the italics is a Xho I site, and the lowercase is extra). The
following PCR mix was used to amplify the acc1 gene: 10X Pfu buffer (10 µL), dNTP
(10 mM; [...] units/µL; 2 µL), and DI water (82 µL). The following protocol was used to
amplify the acc1 gene. After performing PCR, the PCR product was separated on a gel,
and the band corresponding to the acc1 nucleic acid (about 6.7 kb) was gel isolated using
a Qiagen gel isolation kit. The PCR fragment was digested with Not I and Xho I (New
England BioLabs) restriction enzymes. The digested PCR fragment was then ligated to
pET30a, which had been restricted with Not I and Xho I and dephosphorylated with SAP
enzyme. The E. coli strain DH10B was transformed with 1 µL of the ligation mix, and the
cells were recovered in 1 mL of SOC media. The transformed cells were selected on
LB/kanamycin (50 µg/mL) plates. Eight single colonies were selected, and PCR was used
to screen for the correct insert. The plasmid having the correct insert was isolated using
a Qiagen Spin Miniprep kit.

To obtain a polypeptide having acetyl-CoA carboxylase activity, the plasmid
pMSD8 or pET30a/acc1 overexpressing E. coli or S. cerevisiae acetyl-CoA carboxylase
was transformed into Tuner pLacI chemically competent cells (Novagen; Madison, WI).
The transformed cells were selected on LB/chloramphenicol (25 µg/mL) plus carbenicillin
(50 µg/mL) or kanamycin (50 µg/mL).

A crude extract of this strain can be prepared in the following manner. An
overnight culture of Tuner pLacI with pMSD8 is subcultured into 200 mL (in a one liter
baffled culture flask) of fresh M9 media supplemented with 0.4% glucose, 1 µg/mL
thiamine, 0.1% casamino acids, and 50 µg/mL carbenicillin or 50 µg/mL kanamycin and
25 µg/mL chloramphenicol. The culture is grown at 37°C in a shaker with 250 rpm
agitation until it reaches an optical density at 600 nm of about 0.6. IPTG is then added to
a final concentration of 100 µM. The culture is then incubated for an additional 3 hours
with a shaking speed of 250 rpm at 37°C. Cells are harvested by centrifugation at 8000 x g
and are washed one time with 0.85% NaCl. The cell pellet is resuspended in a minimal
volume of 50 mM Tris-HCl (pH 8.0), 5 mM MgCl2, 100 mM KCl, 2 mM DTT, and 5%
glycerol. The cells are lysed by passing them two times through a French Pressure cell at
1000 psig pressure. The cell debris is removed by centrifugation for 20 minutes at
30,000 x g.

The enzyme can be assayed using a method from Davis et al. (J. Biol. Chem.,
275:28593-28598 (2000)).

Example 10 - Cloning a nucleic acid molecule that encodes a polypeptide
having malonyl-CoA reductase activity from Chloroflexus aurantiacus
A polypeptide having malonyl-CoA reductase activity was partially purified from
Chloroflexus aurantiacus and used to obtain amino acid micro-sequencing results.
The amino acid sequencing results were used to identify and clone the nucleic acid that
encodes a polypeptide having malonyl-CoA reductase activity.

Biomass required for protein purification was grown in B. Braun BIOSTAT B
fermenters (B. Braun Biotech International GmbH; Melsungen, Germany). A glass vessel
fitted with a water jacket for heating was used to grow the required biomass. The glass
vessel was connected to its own control unit. The liquid working volume was 4 L, and
the fermenter was operated at 55°C with 75 rpm of agitation. Carbon dioxide was
occasionally bubbled through the headspace of the fermenter to maintain anaerobic
conditions. The pH of the cultures was monitored using a standard pH probe and was
maintained between 8.0 and 8.3. The inoculum for the fermenters was grown in two 250
mL bottles in an Innova 4230 Incubator Shaker (New Brunswick Scientific; Edison, NJ)
at 55°C with interior lights. The fermenters were illuminated by three 65 W plant light
reflector lamps (General Electric; Cleveland, OH). Each lamp was placed approximately
50 cm away from the glass vessel. The media used for the inoculum and the fermenter
culture was as follows per liter: 0.07 g EDTA, 1 mL micronutrient solution, 1 mL FeCl3
solution, 0.06 g CaSO4·2H2O, 0.1 g MgSO4·7H2O, 0.008 g NaCl, 0.075 g KCl, 0.103 g
KNO3, 0.68 g NaNO3, 0.111 g Na2HPO4, 0.2 g NH4Cl, 1 g yeast extract, 2.5 g casamino
acids, 0.5 g glycyl-glycine, and 900 mL DI water. The micronutrient solution contained
the following per liter: 0.5 mL H2SO4 (conc.), 2.28 g MnSO4·7H2O, 0.5 g ZnSO4·7H2O,
0.5 g H3BO3, 0.025 g CuSO4·2H2O, 0.025 g Na2MoO4·2H2O, and 0.045 g CoCl2·6H2O.
The FeCl3 solution contained 0.2905 g FeCl3 per liter. After adjusting the pH of the
media to 8.2 to 8.4, 0.75 g/L Na2S·9H2O was added, the pH was readjusted to 8.2 to 8.4,
and the media was filter-sterilized through a 0.22 µm filter.

The fermenter was inoculated with 500 mL of grown culture. The fermentation
was stopped, and the biomass was harvested, after the cell density reached about 0.5 to
0.6 units at 600 nm.

The cells were harvested by centrifugation at 5000 x g (Beckman JLA 8.1000
rotor) at 4°C, washed with 1 volume of ice cold 0.85% NaCl, and centrifuged again. The
cell pellet was resuspended in 30 mL of ice cold 100 mM Tris-HCl (pH 7.8) buffer that
was supplemented with 2 mM DTT, 5 mM MgCl2, 0.4 mM PEFABLOC (Roche
Molecular Biochemicals; Indianapolis, IN), 1% streptomycin sulfate, and 2 tablets of
Complete EDTA-free protease inhibitor cocktail (Roche Molecular Biochemicals;
Indianapolis, IN). The cell suspension was lysed by passing the suspension three times
through a 50 mL French Pressure Cell operated at 1600 psi (gauge reading). Cell debris
was removed by centrifugation at 30,000 x g (Beckman JA 25.50 rotor). The crude
extract was filtered prior to chromatography using a 0.2 µm HT Tuffryn membrane
syringe filter (Pall Corp.; Ann Arbor, MI). The protein concentration of the crude extract
was 29 mg/mL, which was determined using the BioRad Protein Assay according to the
manufacturer's microassay protocol. Bovine gamma globulin was used for the standard
curve determination. This assay was based on the Bradford dye-binding procedure
(Bradford, Anal. Biochem., 72:248 (1976)).

Before starting the protein purification, the following assay was used to determine
the activity of malonyl-CoA reductase in the crude extract. A 50 µL aliquot of the cell
extract (29 mg/mL) was added to 10 µL 1 M Tris-HCl (final concentration in assay 100
mM), 10 µL 10 mM malonyl-CoA (final concentration in assay 1 mM), 5.5 µL 5.5 mM
NADPH (final concentration in assay 0.3 mM), and 24.5 µL DI water in a 96 well UV
transparent plate (Corning, NY). The enzyme activity was measured at 45°C using a
SpectraMAX Plus 96 well plate reader (Molecular Devices; Sunnyvale, CA). The activity
of malonyl-CoA reductase was monitored by measuring the disappearance of NADPH at
340 nm wavelength. The crude extract exhibited malonyl-CoA reductase activity.
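
The stated final concentrations can be checked with a simple dilution calculation. The sketch below assumes that all assay volumes are in microliters (50, 10, 10, 5.5, and 24.5, which sum to 100) and that the stock concentrations are 1 M Tris-HCl, 10 mM malonyl-CoA, and 5.5 mM NADPH; the function name is hypothetical.

```python
def final_concentration(stock_mM, added_uL, total_uL):
    """Concentration after dilution: stock concentration x (added / total volume)."""
    return stock_mM * added_uL / total_uL


# Assay components and volumes in microliters (units assumed as described above).
volumes_uL = {
    "cell extract": 50.0,
    "Tris-HCl": 10.0,
    "malonyl-CoA": 10.0,
    "NADPH": 5.5,
    "DI water": 24.5,
}
total_uL = sum(volumes_uL.values())

tris_mM = final_concentration(1000.0, volumes_uL["Tris-HCl"], total_uL)
malonyl_coa_mM = final_concentration(10.0, volumes_uL["malonyl-CoA"], total_uL)
nadph_mM = final_concentration(5.5, volumes_uL["NADPH"], total_uL)

print(f"total volume: {total_uL} uL")
print(f"Tris-HCl: {tris_mM} mM, malonyl-CoA: {malonyl_coa_mM} mM, NADPH: {nadph_mM:.4f} mM")
```

Under these assumptions the calculation agrees with the concentrations given in the text, with the NADPH value rounding to 0.3 mM.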
The 5 mL (total 145 mg) protein cell extract was diluted with 20 mL buffer A (20
mM ethanolamine (pH 9.0), 5 mM MgCl2, 2 mM DTT). The chromatographic protein
purification was conducted using a BioLogic protein purification system (BioRad;
Hercules, CA) [...] ion-exchange column that had [...]. After sample
loading, the column was washed with 2.5 column volumes of buffer A at a rate of 2
mL/minute. The proteins were eluted with a linear gradient of NaCl in buffer A from 0 to
0.31 M in 25 column volumes. During this separation process, three mL
fractions were collected. The collection tubes contained 50 µL of Tris-HCl (pH 3.5) so
that the pH of the eluted sample dropped to about pH 7. Major chromatographic peaks
were detected in the region that corresponded to fractions 18 to 21 and 23 to 30. A 200
µL sample was taken from these fractions and concentrated in a microcentrifuge at 4°C
using Microcon YM-10 columns (Millipore Corp.; Bedford, MA) as per the manufacturer's
instructions. To each of the concentrated fractions, buffer A-Tris (100 mM Tris-HCl (pH
7.8), 5 mM MgCl2, 2 mM DTT) was added to bring the total volume to 100 µL. Each of
these fractions was tested for malonyl-CoA reductase activity using the
spectrophotometric assay described above. The majority of specific malonyl-CoA
reductase activity was found in fractions 18 to 21. These fractions were pooled together,
and the pooled sample was desalted using a PD-10 column (Amersham Pharmacia;
Piscataway, NJ) as per the manufacturer's instructions.

The 10.5 mL of desalted protein extract was diluted with buffer A-Tris to a
volume of 25 mL. This desalted diluted sample was applied to a 1 mL HiTrap Blue
column (Amersham Pharmacia; Piscataway, NJ) that was equilibrated with buffer
A-Tris. The sample was loaded at a rate of 0.1 mL/minute. Unbound proteins were washed
off with 2.5 column volumes of buffer A-Tris. The protein was eluted with 100 mM Tris
(pH 7.8), 5 mM MgCl2, 2 mM DTT, 2 mM NADPH, and 1 M NaCl. During this separation
process, one mL fractions were collected. A 200 µL sample was drawn from fractions 49
to 54 and concentrated. Buffer A-Tris was added to each of the concentrated fractions to
bring the total volume to 100 µL. Fractions were assayed for enzyme activity as described
above. The highest specific activity was observed in fraction 51. The entire fraction 51 was
concentrated as described above, and the concentrated sample was separated on an
SDS-PAGE gel.

Electrophoresis was carried out using a Bio-Rad Protean II minigel system and
pre-cast SDS-PAGE gels (4-15%), or a Protean II xi system and 16 cm x 20 cm x 1 mm
SDS-PAGE gels (10%) cast as per the manufacturer's protocol. The gels were run
according to the manufacturer's instructions with a running buffer of 25 mM Tris-HCl
(pH 8.3), 192 mM glycine, and 0.1% SDS.

A gel thickness of 1 mm was used to run samples from fraction 51. Protein from
fraction 51 was loaded onto a 10% SDS-PAGE gel (3 lanes, each containing 75 µg of total
protein). The gels were stained briefly with Coomassie blue (Bio-Rad; Hercules, CA) and
then destained to a clear background with a 10% acetic acid and 20% methanol solution.
The staining revealed a band of about 130 to 140 kDa.

The protein band of about 130-140 kDa was excised with no excess unstained gel
present. An equal area of gel without protein was excised as a negative control. The gel
slices were placed in uncolored microcentrifuge tubes, prewashed with 50% acetonitrile
in HPLC-grade water, washed twice with 50% acetonitrile, and shipped on dry ice to
Harvard Microchemistry Sequencing Facility, Cambridge, MA.

After in-situ enzymatic digestion of the polypeptide sample with trypsin, the
resulting polypeptides were separated by micro-capillary reverse-phase HPLC. The
HPLC was directly coupled to the nano-electrospray ionization source of a Finnigan LCQ
quadrupole ion trap mass spectrometer (µLC/MS/MS). Individual sequence spectra
(MS/MS) were acquired on-line at high sensitivity for the multiple polypeptides separated
during the chromatographic run. The MS/MS spectra of the polypeptides were correlated
with known sequences using the algorithm Sequest developed at the University of
Washington (Eng et al., J. Am. Soc. Mass Spectrom., 5:976 (1994)) and programs
developed at Harvard (Chittum et al., Biochemistry, 37:10866 (1998)). The results were
reviewed for consensus with known proteins and for manual confirmation of fidelity.

A similar purification procedure was used to obtain another sample (protein 1
sample) that was subjected to the same analysis that was used to evaluate the fraction 51
sample.

The polypeptide sequence results indicated that the polypeptides obtained from 
both the fraction 51 sample and the protein 1 sample had similarity to the six (764, 799, 
859, 923, 1090, 1024) contigs sequenced from the C. aurantiacus genome and presented 
on the Joint Genome Institute's web site (http://www.jgi.doe.gov/). The 764 contig was 
the most prominent of the six with about 40 peptide sequences showing similarity. 
BLASTX analysis of each of these contigs on the GenBank web site 
(http://www.ncbi.nlm.nih.gov/BLAST/) indicated that the DNA sequence of the 764 
contig (4201 bases) encoded for polypeptides that had a dehydrogenase/reductase type 
activity. Close inspection of the 764 contig, however, revealed that this contig did not 
have an appropriate ORF that would encode for a 130-140 KDa polypeptide. 

BLASTX analysis also was conducted using the other five contigs. The results of 
this analysis were as follows. The 799 contig (3173 bases) appeared to encode 
polypeptides having phosphate and dehydrogenase type activities. The 859 contig (5865 
bases) appeared to encode polypeptides having synthetase type activities. The 923 contig 
(5660 bases) appeared to encode polypeptides having elongation factor and synthetase 
type activities. The 1090 contig (15201 bases) appeared to encode polypeptides having 
dehydrogenase/reductase and cytochrome and sigma factor activities. The 1024 contig 
(12276 bases) appeared to encode polypeptides having dehydrogenase and decarboxylase 
and synthetase type activities. Thus, the 859 and 923 contigs were eliminated from any 
further analysis. 

The results from the BLASTX analysis also revealed that the dehydrogenase 
found in the 1024 contig was most likely an inositol monophosphate dehydrogenase. 
Thus, the 1024 contig was eliminated as a possible candidate that might encode for a 
polypeptide having malonyl-CoA reductase activity. The 799 contig also was eliminated 
since this contig is part of the OS17 polypeptide described above. 

This narrowed down the search to 2 contigs, the 764 and 1090 contigs. Since the 
contigs were identified using the same protein sample and the dehydrogenase activities 
found in these contigs gave very similar BLASTX results, it was hypothesized that they 
are part of the same polypeptide. Additional evidence supporting this hypothesis was 
obtained from the discovery that the 764 and 1090 contigs are adjacent to each other in 
the C. aurantiacus genome as revealed by an analysis of scaffold data provided by the 
Joint Genome Institute. Sequence similarity and assembly analysis, however, revealed no 
overlapping sequence between these two contigs, possibly due to the presence of gaps in 
the genome sequencing. 

The polypeptide sequences that belonged to the 764 and 1090 contigs were 
mapped. Based on this analysis, an appropriate coding frame and potential start and stop 
codons were identified. The following PCR primers were designed to PCR amplify a 
fragment that encoded for a polypeptide having malonyl-CoA reductase activity: 
PRO140F 5'-ATGGCGACGGGCGAGTCCATGAG-3', SEQ ID NO:153; 
PRO140R 5'-GGACACGAAGAACAGGGCGACAC-3', SEQ ID NO:154; and 
PRO140UPF 5'-GAACTGTCTGGAGTAAGGCTGTC-3', SEQ ID NO:155. The PRO140F primer was 
designed based on the sequence of the 1090 contig and corresponds to the start of the 
potential start codon. The twelfth base was changed from G to C to avoid primer-dimer 
formation. This change does not change the amino acid that was encoded by the codon. 
The PRO140R primer was designed based on the sequence of the 764 contig and corresponds 
to a region located about 1 kb downstream from the potential stop codon. The 
PRO140UPF primer was designed based on the sequence of the 1090 contig and corresponds 
to a region located about 300 bases upstream of the potential start codon. 

Genomic C. aurantiacus DNA was obtained. Briefly, C. aurantiacus was grown 
in 50 mL cultures for 3 to 4 days. Cells were pelleted and washed with 5 mL of a 10 mM 
Tris solution. The genomic DNA was then isolated using the gram positive bacteria 
protocol provided with the Gentra Genomic "Puregene" DNA isolation kit (Gentra Systems, 
Minneapolis, MN). The cell pellet was resuspended in 1 mL Gentra Cell Suspension 
Solution to which 14.2 mg of lysozyme and 4 µL of 20 mg/mL proteinase K solution were 
added. The cell suspension was incubated at 37°C for 30 minutes. The precipitated 
genomic DNA was recovered by centrifugation at 3500g for 25 minutes and air-dried for 
10 minutes. The genomic DNA was suspended in an appropriate amount of a 10 mM 
Tris solution and stored at 4°C. 

Two PCR reactions were set up using C. aurantiacus genomic DNA as template 
as follows: 

PCR Reaction #1                               PCR program 
3.3X rTH polymerase Buffer      30 µL         94°C   2 minutes 
Mg(OAc) (25 mM)                 4 µL          29 cycles of: 
dNTP Mix (10 mM)                3 µL            94°C   30 seconds 
PRO140F (100 µM)                2 µL            63°C   45 seconds 
PRO140R (100 µM)                2 µL            68°C   4.5 minutes 
Genomic DNA (100 ng/mL)         1 µL          68°C   7 minutes 
rTH polymerase (2 U/µL)         2 µL          4°C    Until further use 
pfu polymerase (2.5 U/µL)       0.25 µL 
DI water                        55.75 µL 
Total                           100 µL 

PCR Reaction #2                               PCR program 
3.3X rTH polymerase Buffer      30 µL         94°C   2 minutes 
Mg(OAc) (25 mM)                 4 µL          29 cycles of: 
dNTP Mix (10 mM)                3 µL            94°C   30 seconds 
PRO140UPF (100 µM)              2 µL            60°C   45 seconds 
PRO140R (100 µM)                2 µL            68°C   4.5 minutes 
Genomic DNA (100 ng/mL)         1 µL          68°C   7 minutes 
rTH polymerase (2 U/µL)         2 µL          4°C    Until further use 
pfu polymerase (2.5 U/µL)       0.25 µL 
DI water                        55.75 µL 
Total                           100 µL 
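
As a quick arithmetic check on the reaction set-up, the component volumes listed for PCR Reaction #1 can be summed to confirm that they reach the stated 100 µL total. The following Python sketch is illustrative only; the dictionary and function names are hypothetical and are not part of the described method. The scaling helper is an added convenience that assumes all components scale linearly.

```python
# Component volumes (in µL) for PCR Reaction #1, copied from the table above.
REACTION_1_UL = {
    "3.3X rTH polymerase buffer": 30.0,
    "Mg(OAc) (25 mM)": 4.0,
    "dNTP mix (10 mM)": 3.0,
    "PRO140F (100 uM)": 2.0,
    "PRO140R (100 uM)": 2.0,
    "Genomic DNA": 1.0,
    "rTH polymerase (2 U/uL)": 2.0,
    "pfu polymerase (2.5 U/uL)": 0.25,
    "DI water": 55.75,
}


def total_volume_ul(mix):
    """Return the summed volume of all components in µL."""
    return sum(mix.values())


def scale_mix(mix, factor):
    """Scale every component volume by the same factor (hypothetical helper)."""
    return {name: volume * factor for name, volume in mix.items()}


if __name__ == "__main__":
    print("Total volume (µL):", total_volume_ul(REACTION_1_UL))
    for name, volume in scale_mix(REACTION_1_UL, 0.25).items():
        print(f"{name}: {volume} µL")
```

Reaction #2 differs from Reaction #1 only in the forward primer and the annealing temperature, so the same volume check applies to it.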



The products from both PCR reactions were separated on a 0.8% TAE gel. Both 
PCR reactions produced a product of 4.7 to 5 Kb in size. This approximately matched the 
expected size of a nucleic acid molecule that could encode a polypeptide having 
malonyl-CoA reductase activity. 

Both PCR products were sequenced using sequencing primers 
(1090Fseq 5'-GATTCCGTATGTCACCCCTA-3', SEQ ID NO:156; and 
764Rseq 5'-CAGGCGACTGGCAATCACAA-3', SEQ ID NO:157). The sequence analysis revealed 
a gap between the 764 and 1090 contigs. The nucleic acid sequence between the 
sequences from the 764 and 1090 contigs was greater than 300 base pairs in length (Figure 
51). In addition, the sequence analysis revealed an ORF of 3678 bases that showed 
similarities to dehydrogenase/reductase type enzymes (Figure 52). The amino acid 
sequence encoded by this ORF is 1225 amino acids in length (Figure 50). Also, BLASTP 
analysis of the amino acid sequence encoded by this ORF revealed two short chain 
dehydrogenase domains (adh type). These results are consistent with a polypeptide 
having malonyl-CoA reductase activity since such an enzyme involves two reduction 
steps for the conversion of malonyl CoA to 3-HP. Further, the computed MW of the 
polypeptide was T determined to be ato 
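
The relationship between the ORF length, the polypeptide length, and the size of the band seen on SDS-PAGE can be checked with simple arithmetic. The sketch below assumes that the 3678-base figure includes the stop codon, and it uses a rule-of-thumb average residue mass of about 110 Da; both are assumptions made for illustration and are not values stated in the text. The function names are hypothetical.

```python
ORF_BASES = 3678               # ORF length reported in the text
STOP_CODON_BASES = 3           # assumption: the ORF length includes the stop codon
AVG_RESIDUE_MASS_DA = 110.0    # assumption: rule-of-thumb average residue mass


def residues_from_orf(orf_bases):
    """Number of encoded amino acids for an ORF that ends in a stop codon."""
    if orf_bases % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return (orf_bases - STOP_CODON_BASES) // 3


def approx_mass_kda(n_residues):
    """Rough polypeptide mass in kDa from the residue count."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


if __name__ == "__main__":
    n = residues_from_orf(ORF_BASES)
    print("Residues:", n)
    print("Approximate mass (kDa):", approx_mass_kda(n))
```

Under these assumptions the residue count agrees with the 1225 amino acids reported above, and the rough mass estimate falls within the 130 to 140 KDa range of the excised band.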

PCR was conducted using the PRO140F/PRO140R primer pair, C. aurantiacus 
genomic DNA, and the protocol described above as PCR reaction #1. After the PCR was 
completed, 0.25 U of Taq polymerase (Roche Molecular Biochemicals, Indianapolis, IN) 
was added to the PCR mix, which was then incubated at 72°C for 10 minutes. The PCR 
product was column purified using a Qiagen PCR purification kit (Qiagen Inc., Valencia, 
CA). The purified PCR product was then TOPO cloned into expression vector 
pCRT7/CT as per the manufacturer's instructions (Invitrogen, Carlsbad, CA). TOP10F' 
chemically competent cells were transformed with the TOPO ligation mix as per the 
manufacturer's instructions (Invitrogen, Carlsbad, CA). The cells were recovered for half 
an hour, and the transformants were selected on LB/ampicillin (100 µg/mL) plates. 
Twenty single colonies were selected, and the plasmid DNA was isolated using a Qiagen 
spin Mini prep kit (Qiagen Inc., Valencia, CA). 

Each of these twenty clones was tested for correct orientation and correct insert size 
by PCR. Briefly, plasmid DNA was used as a template, and the following two primers 
were used in the PCR amplification: PCRT7 5'-GAGACCACAACGGTTTCCCTCTA-3', 
SEQ ID NO:158; and PRO140R 5'-GGACACGAAGAACAGGGCGACAC-3', SEQ 
ID NO:159. The following PCR reaction mix and program were used: 

PCR Reaction                                  PCR program 
3.3X rTH polymerase Buffer      7.5 µL        94°C   2 minutes 
Mg(OAc) (25 mM)                 1 µL          25 cycles of: 
dNTP Mix (10 mM)                0.5 µL          94°C   30 seconds 
PCRT7 (100 µM)                  0.125 µL        55°C   45 seconds 
PRO140R (100 µM)                0.125 µL        68°C   4 minutes 
Plasmid DNA                     0.5 µL        68°C   7 minutes 
rTH polymerase (2 U/µL)         0.5 µL        4°C    Until further use 
DI water                        14.75 µL 
Total                           25 µL 





Out of the twenty clones tested, only one clone exhibited the correct insert (Clone # P- 
10). Chemically competent cells of BL21(DE3)pLysS (Invitrogen, Carlsbad, CA) were 
transformed with 2 µL of the P-10 plasmid DNA as per the manufacturer's instructions. 
The cells were recovered at 37°C for 30 minutes and were plated on LB ampicillin (100 
µg/mL) and chloramphenicol (25 µg/mL). 

A 20 mL culture of BL21(DE3)pLysS/P-10 and a 20 mL control culture of 
BL21(DE3)pLysS were incubated overnight. Using the overnight cultures as an inoculum, 
two 100 mL BL21(DE3)pLysS/P-10 clone cultures and two control strain cultures 
(BL21(DE3)pLysS) were started. All the cultures were induced with IPTG when they 
reached an OD of about 0.5 at 600 nm. The control strain cultures were induced with 10 
µM IPTG or 100 µM IPTG, while one of the BL21(DE3)pLysS/P-10 clone cultures was 
induced with 10 µM IPTG and the other with 100 µM IPTG. The cultures were grown for 
2.5 hours after induction. Aliquots were taken from each of the culture flasks before and 
after 2.5 hours of induction and separated using 4-15% SDS-PAGE to analyze 
polypeptide expression. In the induced BL21(DE3)pLysS/P-10 samples, a band 
corresponding to a polypeptide having a molecular weight of about 135 KDa was 
observed. This band was absent in the control strain samples and in samples taken before 
IPTG induction. 

To assess malonyl-CoA reductase activity, BL21(DE3)pLysS/P-10 and 
BL21(DE3)pLysS cells were cultured and then harvested by centrifugation at 8,000 x g 
(Rotor JA 16.250, Beckman Coulter, Fullerton, CA). Once harvested, the cells were 
washed once with an equal volume of a 0.85% NaCl solution. The cell pellets were 
resuspended into 100 mM Tris-HCl buffer that was supplemented with 5 mM MgCl2 and 
2 mM DTT. The cells were disrupted by passing twice through a French Press Cell at 
1,000 psi pressure (gauge value). The cell debris was removed by centrifugation at 
30,000 x g (Rotor JA 25.50, Beckman Coulter, Fullerton, CA). The cell extract was 
maintained at 4°C or on ice until further use. 

Activity of malonyl-CoA reductase was measured at 37°C for both the control 
cells and the IPTG-induced cells. The activity of malonyl-CoA reductase was monitored 
by observing the disappearance of added NADPH as described above. No activity was 
found in the cell extract of the control strain, while the cell extract from the IPTG-induced 
BL21(DE3)pLysS/P-10 cells displayed malonyl-CoA reductase activity with a specific 
activity calculated to be about 0.0942 µmole/minute/mg of total protein. 
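
A specific activity in µmole/minute/mg can be derived from the rate of NADPH disappearance. The sketch below is a generic illustration, not the calculation actually used here: it assumes absorbance is followed at 340 nm with the commonly cited NADPH extinction coefficient of 6.22 mM^-1 cm^-1, and the function name and all input values in the usage section are hypothetical.

```python
NADPH_EXT_COEFF_PER_MM_CM = 6.22  # assumption: NADPH at 340 nm, mM^-1 cm^-1


def specific_activity(delta_a340_per_min, assay_volume_ml, protein_mg, path_cm=1.0):
    """Return specific activity in µmol NADPH consumed per minute per mg protein.

    delta_a340_per_min: magnitude of the absorbance decrease per minute
    assay_volume_ml:    assay volume in mL
    protein_mg:         total protein in the assay in mg
    path_cm:            optical path length in cm
    """
    if protein_mg <= 0 or path_cm <= 0:
        raise ValueError("protein amount and path length must be positive")
    # Beer-Lambert law: mM/min is equivalent to µmol/mL/min.
    rate_mm_per_min = delta_a340_per_min / (NADPH_EXT_COEFF_PER_MM_CM * path_cm)
    umol_per_min = rate_mm_per_min * assay_volume_ml
    return umol_per_min / protein_mg


if __name__ == "__main__":
    # Hypothetical inputs chosen only to show the call; not measured values.
    print(specific_activity(delta_a340_per_min=0.0622,
                            assay_volume_ml=1.0,
                            protein_mg=0.1))
```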

Malonyl-CoA reductase activity also was measured by analyzing 3-HP formation 
from malonyl CoA using the following reaction conducted at 37°C: 

                          Volume     Final conc. 
Tris-HCl (1 M)            10 µL      100 mM 
Malonyl CoA (10 mM)       40 µL      4 mM 
NADPH (10 mM)             30 µL      3 mM 
Cell extract              20 µL 
Total                     100 µL 

The reaction was carried out at 37°C for 30 minutes. In the control reaction, a cell 
extract from BL21(DE3)pLysS was added to a final concentration of 322 mg total 
protein. In the experimental reaction mix, a cell extract from BL21(DE3)pLysS/P-10 was 
added to a final concentration of 226 mg of total protein. The reaction mixtures were 
frozen at -20°C until further analysis. 
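
The final concentrations listed for this reaction follow from the usual dilution relationship (stock concentration times added volume, divided by total volume). The short Python sketch below reproduces that arithmetic from the stock concentrations and volumes given above; the variable and function names are hypothetical.

```python
TOTAL_VOLUME_UL = 100.0

# (component, stock concentration in mM, volume added in µL), from the text.
COMPONENTS = [
    ("Tris-HCl", 1000.0, 10.0),
    ("Malonyl CoA", 10.0, 40.0),
    ("NADPH", 10.0, 30.0),
]


def final_conc_mm(stock_mm, volume_ul, total_ul=TOTAL_VOLUME_UL):
    """Final concentration after dilution: C_final = C_stock * V_added / V_total."""
    if total_ul <= 0:
        raise ValueError("total volume must be positive")
    return stock_mm * volume_ul / total_ul


if __name__ == "__main__":
    for name, stock_mm, volume_ul in COMPONENTS:
        print(f"{name}: {final_conc_mm(stock_mm, volume_ul)} mM")
```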

Chromatographic separation of the components in the reaction mixtures was 
performed using an HPX-87H (7.8 x 300 mm) organic acid HPLC column (BioRad 
Laboratories, Hercules, CA). The column was maintained at 60°C. The mobile phase 
was HPLC grade water adjusted to pH 2.5 using trifluoroacetic acid (TFA) and was 
delivered at a flow rate of 0.6 mL/minute. 

Detection of 3-HP in the reaction samples was accomplished using a 
Waters/Micromass ZQ LC/MS instrument consisting of a Waters 2690 liquid 
chromatograph (Waters Corp., Milford, MA) with a Waters 996 Photo-diode Array 
(PDA) absorbance monitor placed in series between the chromatograph and the single 
quadrupole mass spectrometer. The ionization source was an Atmospheric Pressure 
Chemical Ionization (APCI) ionization source. All parameters of the APCI-MS system 
were optimized and selected based on the generation of the protonated molecular ion 
([M+H]+) of 3-HP. The following parameters were used to detect 3-HP in the positive 
ion mode: Corona: 10 µA; Cone: 20 V; Extractor: 2 V; RF lens: 0.2 V; Source temperature: 
100°C; APCI Probe temperature: 300°C; Desolvation gas: 500 L/hour; Cone gas: 
50 L/hour; Low mass resolution: 15; High mass resolution: 15; Ion energy: 1.0; 
Multiplier: 650. Data was collected in Selected Ion Reporting (SIR) mode set at m/z = 
90.9. 

Both the control reaction sample and the experimental reaction sample were 
probed for presence of 3-HP using the HPLC-mass spectroscopy technique. In the 
control samples, no 3-HP peak was observed, while the experimental sample exhibited a 
peak that matched the retention time and the mass of 3-HP. 

Example 11 - Constructing recombinant cells that produce 3-HP 
A pathway to make 3-hydroxypropionate directly from glucose via acetyl CoA is 
presented in Figure 44. Most organisms such as E. coli, Bacillus, and yeast produce 
acetyl CoA from glucose via glycolysis and the action of pyruvate dehydrogenase. In 
order to divert the acetyl CoA generated from glucose, it is desirable to overexpress two 
genes, one encoding acetyl CoA carboxylase and the other encoding malonyl-CoA 
reductase. As an example, these genes are expressed in E. coli through a T7 promoter 
using vectors pET30a and pFN476. The vector pET30a has a pBR ori and kanamycin 
resistance, while pFN476 has a pSC101 ori and uses carbenicillin resistance for selection. 
Because these two vectors have compatible ori and different markers, they can be 
maintained in E. coli at the same time. Hence, the constructs used to engineer E. coli for 
direct production of 3-hydroxypropionate from glucose are pMSD8 (pFN476/accABCD) 
(Davis et al., J. Biol. Chem., 275:28593-28598, 2000) and pET30a/malonyl-CoA 
reductase or pET30a/acc1 and pFN476/malonyl-CoA reductase. The constructs are 
depicted in Figure 45. 

To test the production of 3-hydroxypropionate from glucose, E. coli strain Tuner 
pLacI carrying plasmid pMSD8 (pFN476/accABCD) and pET30a/malonyl-CoA 
reductase or pET30a/acc1 and pFN476/malonyl-CoA reductase is grown in a B. Braun 
BIOSTAT B fermenter. A glass vessel fitted with a water jacket for heating is used to 
conduct this experiment. The fermenter working volume is 1.5 L and is operated at 37°C. 
The fermenter is continuously supplied with oxygen by bubbling sterile air through it at a 
rate of 1 vvm. The agitation is cascaded to the dissolved oxygen concentration, which is 
maintained at 40% DO. The pH of the liquid media is maintained at 7 using 2 N NaOH. 
The E. coli strain is grown in M9 media supplemented with 1% glucose, 1 µg/mL 
thiamine, 0.1% casamino acids, 10 µg/mL biotin, 50 µg/mL carbenicillin, 50 µg/mL 
kanamycin, and 25 µg/mL chloramphenicol. The expression of the genes is induced 
when the cell density reaches 0.5 OD (600 nm) by adding 100 µM IPTG. After induction, 
samples of 2 mL volume are taken at 1, 2, 3, 4, and 8 hours. In addition, at 3 hours after 
induction, a 200 mL sample is taken to make a cell extract. The 2 mL samples are spun, 
and the supernatant is used to analyze products using an LC/MS technique. The supernatant 
is stored at -20°C until further analysis. 

The extract is prepared by spinning the 200 mL of cell suspension at 8000 g and 
washing the cell pellet with 50 mL of 50 mM Tris-HCl (pH 8.0), 5 mM MgCl2, 100 
mM KCl, 2 mM DTT, and 5% glycerol. The cell suspension is spun again at 8000 g, and 
the pellet is resuspended into 5 mL of 50 mM Tris-HCl (pH 8.0), 5 mM MgCl2, 100 mM 
KCl, 2 mM DTT, and 5% glycerol. The cells are disrupted by passing twice through a 
French Press at 1000 psig. The cell debris is removed by centrifugation for 20 minutes at 
30,000 g. All the operations are conducted at 4°C. To demonstrate in vitro formation of 
3-hydroxypropionate using this recombinant cell extract, the following reaction of 200 µL 
is conducted at 37°C. The reaction mix is as follows: Tris-HCl (pH 8.0; 100 mM), ATP 
(1 mM), MgCl2 (5 mM), KCl (100 mM), DTT (5 mM), NaHCO3 (40 mM), NADPH (0.5 
mM), acetyl CoA (0.5 mM), and cell extract (0.2 mg). The reaction is stopped after 15 
minutes by adding 1 volume of 10% trifluoroacetic acid (TFA). The products of this 
reaction are detected using an LC/MS technique. 

The detection and analysis for the presence of 3-hydroxypropionate in the 
supernatant and the in vitro reaction mixture is carried out using a Waters/Micromass ZQ 
LC/MS instrument. This instrument consists of a Waters 2690 liquid chromatograph with 
a Waters 2410 refractive index detector placed in series between the chromatograph and 
the single quadrupole mass spectrometer. LC separations are made using a Bio-Rad 
Aminex 87-H ion-exchange column at 45°C. Sugars, alcohol, and organic acid products 
are eluted with 0.015% TFA buffer. For elution, the buffer is passed at a flow rate of 0.6 
mL/minute. For detection and quantification of 3-hydroxypropionate, a sample obtained 
from TCI, America (Portland, OR) is used as a standard. 

Example 12 - Cloning of propionyl-CoA transferase, lactyl-CoA dehydratase (LDH), 
and a hydratase (OS19) for Expression in Saccharomyces cerevisiae 

The pESC Yeast Epitope Tagging Vector System (Stratagene, La Jolla, CA) was 
used in cloning the genes involved in 3-hydroxypropionic acid production via lactic acid. 
The pESC vectors each contain GAL1 and GAL10 promoters in opposing directions, 
allowing the expression of two genes from each vector. The GAL1 and GAL10 
promoters are repressed by glucose and induced by galactose. Each of the four available 
pESC vectors has a different yeast-selectable marker (HIS3, TRP1, LEU2, URA3) so 
multiple plasmids can be maintained in a single strain. Each cloning region has a 
polylinker site for gene insertion, a transcription terminator, and an epitope coding 
sequence for C-terminal or N-terminal epitope tagging of expressed proteins. The pESC 
vectors also have a ColE1 origin of replication and an ampicillin resistance gene to allow 
replication and selection in E. coli. The following vector/promoter/nucleic acid 
combinations were constructed: 



Vector      Promoter   Polypeptide       Source of nucleic acid 
pESC-Trp    GAL1       OS19 hydratase    Chloroflexus aurantiacus 
            GAL10      E1                Megasphaera elsdenii 
pESC-Leu    GAL1       E2α               Megasphaera elsdenii 
            GAL10      E2β               Megasphaera elsdenii 
pESC-His    GAL1       D-LDH             Escherichia coli 
            GAL10      PCT               Megasphaera elsdenii 



The primers used were as follows: 
OS19APAF: 5'-ATAGGGCCCAGGAGATCAAACCATGGGTGAAGAGTCTCTGGTTC-3' (SEQ ID NO:164) 
OS19SALR: 5'-CCTCTGCTACAGTCGACACAACGACCACTGAAGTTGGGAG-3' (SEQ ID NO:165) 
OS19KPNR: 5'-AGTCTGCTATCGGTACCTCAACGACCACTGAAGTTGGGAG-3' (SEQ ID NO:166) 
EINOTF: 5'-ATAGCGGCCGCATAATGGATACTCTCGGAATCGACGTTGG-3' (SEQ ID NO:167) 
EICLAR: 5'-CCCATCGATACATATTTCTTGATTTTATCATAAGCAATC-3' (SEQ ID NO:168) 
EIIαAPAF: 5'-CCAGGGCCCATAATGGGTGAAGAAAAAACAGTAGATATTG-3' (SEQ ID NO:169) 
EIIαSALR: 5'-GGTAGACTTGTCGAGGTAGTGGTTTCCTCCTTCATTGG-3' (SEQ ID NO:170) 
EIIβNOTF: 5'-ATAGCGGGCGCATAATGGGTCAGATCGACGAACTTATCAG-3' (SEQ ID NO:171) 
EIIβSPER: 5'-ACKjTTCAACTAGTT^ CTG-3' (SEQ ID NO:172) 
LDHAPAF: 5'-CTAGGGCCCATAATGGAACTCXjCCGTTTATAGCAC-3' (SEQ ID NO:173) 
LDHXHOR: 5'-ACTTCTCGAGTTAAACCAGTTCGTTGGGGCAGGT-3' (SEQ ID NO:174) 
PCTSPEF: 5'-GGGACTAGTATAATGGGAAAAGTAGAAATCATTACAG-3' (SEQ ID NO:175) 
PCTPACR: 5'-CGGCTTAATTAACAGCAGAGATTTATTTTTTCAGTCC-3' (SEQ ID NO:176) 

All restriction enzymes were obtained from New England Biolabs, Beverly, MA. 
All plasmid DNA preparations were done using QIAprep Spin Miniprep Kits, and all gel 
purifications were done using QIAquick Gel Extraction Kits (Qiagen, Valencia, CA). 

A. Construction of the pESC-Trp/OS19 hydratase vector 

Two constructs in pESC-Trp were made for the OS19 nucleic acid from C. 
aurantiacus. One of these constructs utilized the Apa I and Sal I restriction sites of the 
GAL1 multiple cloning site and was designed to include the c-myc epitope. The second 
construct utilized the Apa I and Kpn I sites and thus did not include the c-myc epitope 
sequence. 

Six µg of pESC-Trp vector DNA was digested with the restriction enzyme Apa I, 
and the digest was purified using a QIAquick PCR Purification Column. Three µg of the 
Apa I-digested vector DNA was then digested with the restriction enzyme Kpn I, and 3 µg 
was digested with Sal I. The double-digested vector DNAs were separated on a 1% 
TAE-agarose gel, purified, dephosphorylated with shrimp alkaline phosphatase (Roche 
Biochemical Products, Indianapolis, IN), and purified with a QIAquick PCR Purification 
Column. 

The nucleic acid encoding the Chloroflexus aurantiacus polypeptide having 
hydratase activity (OS19) was amplified from genomic DNA using the PCR primer pair 
OS19APAF and OS19SALR and the primer pair OS19APAF and OS19KPNR. 
OS19APAF was designed to introduce an Apa I restriction site and a translation initiation 
site (ACCATGG) at the beginning of the amplified fragment. The OS19KPNR primer 
was designed to introduce a Kpn I restriction site at the end of the amplified fragment and 
to contain the translational stop codon for the hydratase gene. OS19SALR introduces a 
Sal I site at the end of the amplified fragment and has an altered stop codon so that 
translation continues in-frame through the vector c-myc epitope. The PCR mix contained 
the following: 1X Expand PCR buffer, 100 ng C. aurantiacus genomic DNA, 0.2 µM of 
each primer, 0.2 mM each dNTP, and 5.25 units of Expand DNA Polymerase (Roche) in 
a final volume of 100 µL. The PCR reaction was performed in an MJ Research PTC100 
under the following conditions: an initial denaturation at 94°C for 1 minute; 8 cycles of 
94°C for 30 seconds, 57°C for 1 minute, and 72°C for 2.25 minutes; 24 cycles of 94°C for 
30 seconds, 62°C for 1 minute, and 72°C for 2.25 minutes; and a final extension for 7 
minutes at 72°C. The amplification product was then separated by gel electrophoresis 
using a 1% TAE-agarose gel. A 0.8 Kb fragment was excised from the gel and purified 
for each primer pair. The purified fragments were digested with Kpn I or Sal I restriction 
enzyme, purified with a QIAquick PCR Purification Column, digested with Apa I 
restriction enzyme, purified again with a QIAquick PCR Purification Column, and 
quantified on a minigel. 
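
The primer design features described above can be confirmed directly from the OS19 primer sequences. The Python sketch below looks for the standard Apa I (GGGCCC), Sal I (GTCGAC), and Kpn I (GGTACC) recognition sequences and inspects the three bases that follow the site in each reverse primer. The helper names are hypothetical, and the reading of those three bases as the reverse complement of the final codon is an interpretation made for illustration.

```python
# Standard recognition sequences for the enzymes named in the text.
SITES = {"ApaI": "GGGCCC", "SalI": "GTCGAC", "KpnI": "GGTACC"}

# OS19 primer sequences as listed in the primer table (5' to 3').
PRIMERS = {
    "OS19APAF": "ATAGGGCCCAGGAGATCAAACCATGGGTGAAGAGTCTCTGGTTC",
    "OS19SALR": "CCTCTGCTACAGTCGACACAACGACCACTGAAGTTGGGAG",
    "OS19KPNR": "AGTCTGCTATCGGTACCTCAACGACCACTGAAGTTGGGAG",
}

STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def sense_codon_after_site(primer, site):
    """Sense-strand codon encoded by the 3 bases following a site in a reverse primer."""
    start = primer.index(site) + len(site)
    return reverse_complement(primer[start:start + 3])


if __name__ == "__main__":
    print("Apa I site in OS19APAF:", SITES["ApaI"] in PRIMERS["OS19APAF"])
    print("ACCATGG in OS19APAF:", "ACCATGG" in PRIMERS["OS19APAF"])
    for name, enzyme in (("OS19KPNR", "KpnI"), ("OS19SALR", "SalI")):
        codon = sense_codon_after_site(PRIMERS[name], SITES[enzyme])
        print(name, "codon:", codon, "stop codon:", codon in STOP_CODONS)
```

In this reading, the OS19KPNR primer retains a stop codon immediately before the Kpn I site, while the OS19SALR primer replaces it with a sense codon, consistent with the read-through into the c-myc epitope described above.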

50-60 ng of the digested PCR product containing the nucleic acid encoding the C. 
aurantiacus polypeptide having hydratase activity (OS19) and 50 ng of the prepared 
pESC-Trp vector were ligated using T4 DNA ligase at 16°C for 16 hours. One µL of the 
ligation reaction was used to electroporate 40 µL of E. coli Electromax™ DH10B™ cells. 
The electroporated cells were plated onto LB plates containing 100 µg/mL of 
carbenicillin (LBC). Individual colonies were screened using colony PCR with the 
appropriate PCR primers. Individual colonies were suspended in about 25 µL of 10 mM 
Tris, and 2 µL of the suspension was plated on LBC media. The remnant suspension was 
heated for 10 minutes at 95°C to break open the bacterial cells, and 2 µL of the heated 
cells was used in a 25 µL PCR reaction. The PCR mix contained the following: 1X Taq 
buffer, 0.2 µM each primer, 0.2 mM each dNTP, and 1 unit of Taq DNA polymerase per 
reaction. The PCR program used was the same as described above for amplification of 
the nucleic acid from genomic DNA. 

Plasmid DNA was isolated from cultures of colonies having the desired insert and 
was sequenced to confirm the lack of nucleotide errors from PCR. A construct with a 
confirmed sequence was transformed into S. cerevisiae strain YPH500 using a Frozen-EZ 
Yeast Transformation II™ Kit (Zymo Research, Orange, CA). Transformation reactions 
were plated on SC-Trp media (see Stratagene pESC Vector Instruction Manual for media 
recipes). Individual yeast colonies were screened for the presence of the OS19 nucleic 
acid by colony PCR. Colonies were suspended in 20 µL of Y-Lysis Buffer (Zymo 
Research) containing 5 units of zymolyase and heated at 37°C for 10 minutes. Three µL 
of this suspension was then used in a 25 µL PCR reaction using the PCR reaction mixture 
and program described for the colony screen of the E. coli transformants. The pESC-Trp 
vector was also transformed into YPH500 for use as a hydratase assay control, and 
transformants were screened by PCR using GAL1 and GAL10 primers. 

B. Construction of the pESC-Trp/OS19 hydratase/E1 vector 

Plasmid DNA of a pESC-Trp/OS19 construct (Apa I-Sal I sites) with confirmed 
sequence and positive assay results was used for insertion of the nucleic acid for the M. 
elsdenii E1 activator polypeptide downstream of the GAL10 promoter. Three µg of 
plasmid DNA was digested with the restriction enzyme Cla I, and the digest was purified 
using a QIAquick PCR Purification Column. The vector DNA was then digested with the 
restriction enzyme Not I, and the digest was inactivated by heating to 65°C for 20 
minutes. The double-digested vector DNA was dephosphorylated with shrimp alkaline 
phosphatase (Roche), separated on a 1% TAE-agarose gel, and gel purified. 

The nucleic acid encoding the M. elsdenii E1 activator polypeptide was amplified 
from genomic DNA using the PCR primer pair EINOTF and EICLAR. EINOTF was 
designed to introduce a Not I restriction site and a translation initiation site at the 
beginning of the amplified fragment. The EICLAR primer was designed to introduce a 
Cla I restriction site at the end of the amplified fragment and to contain an altered 
translational stop codon to allow in-frame translation of the FLAG epitope. The PCR mix 
contained the following: 1X Expand PCR buffer, 100 ng M. elsdenii genomic DNA, 0.2 
µM of each primer, 0.2 mM each dNTP, and 5.25 units of Expand DNA Polymerase in a 
final volume of 100 µL. The PCR reaction was performed in an MJ Research PTC100 
under the following conditions: an initial denaturation at 94°C for 1 minute; 8 cycles of 
94°C for 30 seconds, 55°C for 45 seconds, and 72°C for 3 minutes; 24 cycles of 94°C for 
30 seconds, 62°C for 45 seconds, and 72°C for 3 minutes; and a final extension for 7 
minutes at 72°C. The amplification product was then separated by gel electrophoresis 
using a 1% TAE-agarose gel, and a 0.8 Kb fragment was excised and purified. The 
purified fragment was digested with Cla I restriction enzyme, purified with a QIAquick 
PCR Purification Column, digested with Not I restriction enzyme, purified again with a 
QIAquick PCR Purification Column, and quantified on a minigel. 

60 ng of the digested PCR product containing the nucleic acid for the M. elsdenii 
E1 activator polypeptide and 70 ng of the prepared pESC-Trp/OS19 hydratase vector 
were ligated using T4 DNA ligase at 16°C for 16 hours. One µL of the ligation reaction 
was used to electroporate 40 µL of E. coli Electromax™ DH10B™ cells. The 
electroporated cells were plated onto LBC media. Individual colonies were screened 
using colony PCR with the EINOTF and EICLAR primers. Individual colonies were 
suspended in about 25 µL of 10 mM Tris, and 2 µL of the suspension was plated on LBC 
media. The remnant suspension was heated for 10 minutes at 95°C to break open the 
bacterial cells, and 2 µL of the heated cells was used in a 25 µL PCR reaction. The PCR mix 
contained the following: 1X Taq buffer, 0.2 µM each primer, 0.2 mM each dNTP, and 1 
unit of Taq DNA polymerase per reaction. The PCR program used was the same as 
described above for amplification of the gene from genomic DNA. Plasmid DNA was 
isolated from cultures of colonies having the desired insert and was sequenced to confirm 
the lack of nucleotide errors from PCR. 

C. Construction of the pESC-Leu/EIIα/EIIβ vector 

Three µg of DNA of the vector pESC-Leu was digested with the restriction 
enzyme Apa I, and the digest was purified using a QIAquick PCR Purification Column. 
The vector DNA was then digested with the restriction enzyme Sal I, and the digest was 
inactivated by heating to 65°C for 20 minutes. The double-digested vector DNA was 
dephosphorylated with shrimp alkaline phosphatase (Roche), separated on a 1% 
TAE-agarose gel, and gel purified. 

The nucleic acid encoding the M. elsdenii E2α polypeptide was amplified from 
genomic DNA using the PCR primer pair EIIαAPAF and EIIαSALR. EIIαAPAF was 
designed to introduce an Apa I restriction site and a translation initiation site at the 
beginning of the amplified fragment. The EIIαSALR primer was designed to introduce a 
Sal I restriction site at the end of the amplified fragment and to contain an altered 
translational stop codon to allow in-frame translation of the c-myc epitope. The PCR mix 
contained the following: 1X Expand PCR buffer, 100 ng M. elsdenii genomic DNA, 0.2 
µM of each primer, 0.2 mM each dNTP, and 5.25 units of Expand DNA Polymerase in a 
final volume of 100 µL. The PCR reaction was performed in an MJ Research PTC100 
under the following conditions: an initial denaturation at 94°C for 1 minute; 8 cycles of 
94°C for 30 seconds, 55°C for 1 minute, and 72°C for 3 minutes; 24 cycles of 94°C for 30 
seconds, 62°C for 1 minute, and 72°C for 3 minutes; and a final extension for 7 minutes 
at 72°C. The amplification product was then separated by gel electrophoresis using a 1% 
TAE-agarose gel, and the amplified fragment was excised and purified. The purified fragment 
was digested with Apa I restriction enzyme, purified with a QIAquick PCR Purification 
Column, digested with Sal I restriction enzyme, purified again with a QIAquick PCR 
Purification Column, and quantified on a minigel. 

80 ng of the digested PCR product containing the nucleic acid encoding the M. 
elsdenii E2α polypeptide and 80 ng of the prepared pESC-Leu vector were ligated using 
T4 DNA ligase at 16°C for 16 hours. One µL of the ligation reaction was used to 
electroporate 40 µL of E. coli Electromax™ DH10B™ cells. The electroporated cells 
were plated onto LBC media. Individual colonies were screened using colony PCR with 
the EIIαAPAF and EIIαSALR primers. Individual colonies were suspended in about 25 
µL of 10 mM Tris, and 2 µL of the suspension was plated on LBC media. The remnant 
suspension was heated for 10 minutes at 95°C to break open the bacterial cells, and 2 µL 
of the heated cells was used in a 25 µL PCR reaction. The PCR mix contained the following: 
1X Taq buffer, 0.2 µM each primer, 0.2 mM each dNTP, and 1 unit of Taq DNA 
polymerase per reaction. The PCR program used was the same as described above for 
amplification of the gene from genomic DNA. Plasmid DNA was isolated from cultures 
of colonies having the desired insert and was sequenced to confirm the lack of nucleotide 
errors from PCR. 

Plasmid DNA of a pESC-Leu/EIIα vector with confirmed sequence was used for 
insertion of the nucleic acid encoding the M. elsdenii E2β polypeptide. Three µg of 
plasmid DNA was digested with the restriction enzyme Spe I, and the digest was purified 
using a QIAquick PCR Purification Column. The vector DNA was then digested with the 
restriction enzyme Not I and gel purified from a 1% TAE-agarose gel. The 
double-digested vector DNA was then dephosphorylated with shrimp alkaline phosphatase 
(Roche) and purified with a QIAquick PCR Purification Column. 

The nucleic acid encoding the M. elsdenii E2β polypeptide was amplified from
genomic DNA using the PCR primer pair EIIβNOTF and EIIβSPER. The EIIβNOTF
primer was designed to introduce a Not I restriction site and a translation initiation site at
the beginning of the amplified fragment. The EIIβSPER primer was designed to
introduce an Spe I restriction site at the end of the amplified fragment and to contain an
altered translational stop codon to allow for in-frame translation of the FLAG epitope.
The PCR mix contained the following: 1X Expand PCR buffer, 100 ng M. elsdenii
genomic DNA, 0.2 µM of each primer, 0.2 mM each dNTP, and 5.25 units of Expand
DNA Polymerase in a final volume of 100 µL. The PCR reaction was performed in an
MJ Research PTC100 under the following conditions: an initial denaturation at 94°C for 1
minute; 8 cycles of 94°C for 30 seconds, 55°C for 45 seconds, and 72°C for 3 minutes; 24
cycles of 94°C for 30 seconds, 62°C for 45 seconds, and 72°C for 3 minutes; and a final
extension for 7 minutes at 72°C. The amplification product was separated by gel
electrophoresis using a 1% TAE-agarose gel, and a 1.1 Kb fragment was excised and
purified. The purified fragment was digested with Spe I restriction enzyme, purified with
a QIAquick PCR Purification Column, digested with Not I restriction enzyme, purified
again with a QIAquick PCR Purification Column, and quantified on a minigel.
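As an illustration only, the two-stage cycling program described above for the E2β amplification can be written down as data, which makes it easy to check the programmed hold time. The sketch below is not part of the described method; the names are hypothetical, and ramp times between temperatures are ignored (an assumption), so the real run time on a thermocycler would be longer.

```python
# Hold steps are (temperature in deg C, hold time in seconds), taken from the
# program described in the text. Ramp times are NOT modeled (assumption).
INITIAL_DENATURATION = [(94, 60)]
STAGE_1 = {"cycles": 8, "steps": [(94, 30), (55, 45), (72, 180)]}
STAGE_2 = {"cycles": 24, "steps": [(94, 30), (62, 45), (72, 180)]}
FINAL_EXTENSION = [(72, 420)]


def total_hold_seconds():
    """Sum the programmed hold times over the whole PCR program."""
    total = sum(seconds for _, seconds in INITIAL_DENATURATION)
    for stage in (STAGE_1, STAGE_2):
        per_cycle = sum(seconds for _, seconds in stage["steps"])
        total += stage["cycles"] * per_cycle
    total += sum(seconds for _, seconds in FINAL_EXTENSION)
    return total


if __name__ == "__main__":
    print(total_hold_seconds() / 60, "minutes of programmed hold time")
```

The two stages differ only in annealing temperature (55°C, then 62°C), so the per-cycle hold time is the same in both.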
38 ng of the digested PCR product containing the nucleic acid encoding the M.
elsdenii E2β polypeptide and 50 ng of the prepared pESC-Leu/EIIα vector were ligated
using T4 DNA ligase at 16°C for 16 hours. One µL of the ligation reaction was used to
electroporate 40 µL of E. coli Electromax™ DH10B™ cells. The electroporated cells
were plated onto LBC plates. Individual colonies were screened using colony PCR with
the EIIβNOTF and EIIβSPER primers. Individual colonies were suspended in about 25
µL of 10 mM Tris, and 2 µL of the suspension was plated on LBC media. The remnant
suspension was heated for 10 minutes at 95°C to break open the bacterial cells, and 2 µL
of the heated cells was used in a 25 µL PCR reaction. The PCR mix contained the
following: 1X Taq buffer, 0.2 µM each primer, 0.2 mM each dNTP, and 1 unit of Taq
DNA polymerase per reaction. The PCR program used was the same as described above
for amplification of the gene from genomic DNA.

Plasmid DNA was isolated from cultures of colonies having the desired insert and
was sequenced to confirm the lack of nucleotide errors from PCR. A pESC-Leu/EIIα/EIIβ
construct with a confirmed sequence was co-transformed along with the pESC-
Trp/OS19/EI vector into S. cerevisiae strain YPH500 using a Frozen-EZ Yeast
Transformation II™ Kit (Zymo Research, Orange, CA). Transformation reactions were
plated on SC-Trp-Leu media. Individual yeast colonies were screened for the presence of
the OS19, E1, E2α, and E2β nucleic acid by colony PCR. Colonies were suspended in 20
µL of Y-Lysis Buffer (Zymo Research) containing 5 units of zymolase and heated at
37°C for 10 minutes. Three µL of this suspension was then used in a 25 µL PCR
reaction using the PCR reaction mixtures and programs described for the colony screens
of the E. coli transformants. The pESC-Trp/OS19 and pESC-Leu vectors were also co-
transformed into YPH500 for use as a lactyl-CoA dehydratase assay control. These
transformants were colony screened using the GAL1 and GAL10 primers (Instruction
manual, pESC Yeast Epitope Tagging Vectors, Stratagene).

D. Construction of the pESC-His/D-LDH/PCT vector

Three µg of DNA of the vector pESC-His was digested with the restriction
enzyme Xho I, and the digest was purified using a QIAquick PCR Purification Column.
The vector DNA was then digested with the restriction enzyme Apa I and gel purified
from a 1% TAE-agarose gel. The double-digested vector DNA was dephosphorylated
with shrimp alkaline phosphatase (Roche) and purified using a QIAquick PCR
Purification Column.

The E. coli D-LDH gene was amplified from genomic DNA of strain DH10B
using the PCR primer pair LDHAPAF and LDHXHOR. LDHAPAF was designed to
introduce an Apa I restriction site and a translation initiation site at the beginning of the
amplified fragment. The LDHXHOR primer was designed to introduce an Xho I
restriction site at the end of the amplified fragment and to contain the translational stop
codon for the D-LDH gene. The PCR mix contained the following: 1X Expand PCR
buffer, 100 ng E. coli genomic DNA, 0.2 µM of each primer, 0.2 mM each dNTP, and
5.25 units of Expand DNA Polymerase in a final volume of 100 µL. The PCR reaction
was performed in an MJ Research PTC100 under the following conditions: an initial
denaturation at 94°C for 1 minute; 8 cycles of 94°C for 30 seconds, 59°C for 45 seconds,
and 72°C for 2 minutes; 24 cycles of 94°C for 30 seconds, 64°C for 45 seconds, and 72°C
for 2 minutes; and a final extension for 7 minutes at 72°C. The amplification product was
separated by gel electrophoresis using a 1% TAE-agarose gel, and a 1.0 Kb fragment was
excised and purified. The purified fragment was digested with Apa I restriction enzyme,
purified with a QIAquick PCR Purification Column, digested with Xho I restriction
enzyme, purified again with a QIAquick PCR Purification Column, and quantified on a
minigel.

80 ng of the digested PCR product containing the E. coli D-LDH gene and 80 ng
of the prepared pESC-His vector were ligated using T4 DNA ligase at 16°C for 16 hours.
One µL of the ligation reaction was used to electroporate 40 µL of E. coli Electromax™
DH10B™ cells. The electroporated cells were plated onto LBC media. Individual
colonies were screened using colony PCR with the LDHAPAF and LDHXHOR primers.
Individual colonies were suspended in about 25 µL of 10 mM Tris, and 2 µL of the
suspension was plated on LBC media. The remnant suspension was heated for 10
minutes at 95°C to break open the bacterial cells, and 2 µL of the heated cells was used in
a 25 µL PCR reaction. The PCR mix contained the following: 1X Taq buffer, 0.2 µM each
primer, 0.2 mM each dNTP, and 1 unit of Taq DNA polymerase per reaction. The PCR
program used was the same as described above for amplification of the gene from
genomic DNA. Plasmid DNA was isolated from cultures of colonies having the desired
insert and was sequenced to confirm the lack of nucleotide errors from PCR.
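For orientation, the insert-to-vector molar ratio implied by the ligation above (80 ng of the roughly 1.0 Kb D-LDH fragment with 80 ng of vector) can be estimated as sketched below. The text does not state the size of the pESC-His vector; the value of 7,800 bp used here is an assumption for illustration only, and the function name is hypothetical.

```python
def insert_to_vector_molar_ratio(insert_ng, insert_bp, vector_ng, vector_bp):
    """Molar ratio of insert to vector for double-stranded DNA.

    Moles are proportional to mass divided by length in bp, so the
    per-bp molecular weight cancels out of the ratio.
    """
    if min(insert_ng, insert_bp, vector_ng, vector_bp) <= 0:
        raise ValueError("all amounts and lengths must be positive")
    return (insert_ng / insert_bp) / (vector_ng / vector_bp)


# Assumption: approximate vector length, NOT given in the text.
ASSUMED_VECTOR_BP = 7800

ratio = insert_to_vector_molar_ratio(
    insert_ng=80, insert_bp=1000, vector_ng=80, vector_bp=ASSUMED_VECTOR_BP
)

if __name__ == "__main__":
    print(f"approximate insert:vector molar ratio = {ratio:.1f}:1")
```

With equal masses of insert and vector, the molar ratio simply equals the ratio of the two lengths, so the result depends entirely on the assumed vector size.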

Plasmid DNA of a pESC-His/D-LDH construct with a confirmed sequence was
used for insertion of the nucleic acid encoding the M. elsdenii PCT polypeptide. Three µg
of plasmid DNA was digested with the restriction enzyme Pac I, and the digest was
purified using a QIAquick PCR Purification Column. The vector DNA was then digested
with the restriction enzyme Spe I and gel purified from a 1% TAE-agarose gel. The
double-digested vector DNA was dephosphorylated with shrimp alkaline phosphatase
(Roche) and purified with a QIAquick PCR Purification Column.

The nucleic acid encoding the M. elsdenii PCT polypeptide was amplified from
genomic DNA using the PCR primer pair PCTSPEF and PCTPACR. PCTSPEF was
designed to introduce an Spe I restriction site and a translation initiation site at the
beginning of the amplified fragment. The PCTPACR primer was designed to introduce a
Pac I restriction site at the end of the amplified fragment and to contain the translational
stop codon for the PCT gene. The PCR mix contained the following: 1X Expand PCR
buffer, 100 ng M. elsdenii genomic DNA, 0.2 µM of each primer, 0.2 mM each dNTP,
and 5.25 units of Expand DNA Polymerase in a final volume of 100 µL. The PCR
reaction was performed in an MJ Research PTC100 under the following conditions: an
initial denaturation at 94°C for 1 minute; 8 cycles of 94°C for 30 seconds, 56°C for 45
seconds, and 72°C for 2.5 minutes; 24 cycles of 94°C for 30 seconds, 64°C for 45
seconds, and 72°C for 2.5 minutes; and a final extension for 7 minutes at 72°C. The
amplification product was separated by gel electrophoresis using a 1% TAE-agarose gel,
and a 1.55 Kb fragment was excised and purified. The purified fragment was digested
with Pac I restriction enzyme, purified with a QIAquick PCR Purification Column,
digested with Spe I restriction enzyme, purified again with a QIAquick PCR Purification
Column, and quantified on a minigel.

95 ng of the digested PCR product containing the nucleic acid encoding the M.
elsdenii PCT polypeptide and 75 ng of the prepared pESC-His/D-LDH vector were
ligated using T4 DNA ligase at 16°C for 16 hours. One µL of the ligation reaction was
used to electroporate 40 µL of E. coli Electromax™ DH10B™ cells. The electroporated
cells were plated onto LBC plates. Individual colonies were screened using colony PCR
with the PCTSPEF and PCTPACR primers. Individual colonies were suspended in about
25 µL of 10 mM Tris, and 2 µL of the suspension was plated on LBC media. The
remnant suspension was heated for 10 minutes at 95°C to break open the bacterial cells,
and 2 µL of the heated cells was used in a 25 µL PCR reaction. The PCR mix contained the
following: 1X Taq buffer, 0.2 µM each primer, 0.2 mM each dNTP, and 1 unit of Taq
DNA polymerase per reaction. The PCR program used was the same as described above
for amplification of the gene from genomic DNA.

Plasmid DNA was isolated from cultures of colonies having the desired insert and
was sequenced to confirm the lack of nucleotide errors from PCR. A construct with a
confirmed sequence was transformed into S. cerevisiae strain YPH500 using a Frozen-EZ
Yeast Transformation II™ Kit (Zymo Research, Orange, CA). Transformation reactions
were plated on SC-His media. Individual yeast colonies were screened for the presence
of the D-LDH and PCT genes by colony PCR. Colonies were suspended in 20 µL of Y-
Lysis Buffer (Zymo Research) containing 5 units of zymolase and heated at 37°C for 10
minutes. Three µL of this suspension was then used in a 25 µL PCR reaction using the
PCR reaction mixture and program described for the colony screen of the E. coli
transformants. The pESC-His vector was also transformed into YPH500 for use as an
assay control, and transformants were screened by PCR using GAL1 and GAL10 primers.

Example 13 - Expression of Enzymes in S. cerevisiae
A. Hydratase Activity in Transformed Yeast

Individual colonies carrying the pESC-Trp/OS19 construct or the pESC-Trp
vector (negative control) were used to inoculate 5 mL cultures of SC-Trp media
containing 2% glucose. These cultures were grown for 16 hours at 30°C and used to
inoculate 35 mL of the same media. The subcultures were grown for 7 hours at 30°C,
and their OD600s were determined. A volume of cells giving an OD x volume equal to 40
was pelleted, washed with SC-Trp media with no carbon source, and repelleted. The cells
were suspended in 5 mL of SC-Trp media containing 2% galactose and used to inoculate
a total volume of 100 mL of this media. Cultures were grown for 17.5 hours at 30°C and
250 rpm. Cells were then pelleted, rinsed in 0.85% NaCl, and repelleted. Cell pellets (70
mg) were suspended in 140 µL of 50 mM Tris-HCl, pH 7.5, and an equal volume (pellet
plus buffer) of pre-rinsed glass beads (Sigma, 150-212 microns) was added. This mixture
was vortexed for 30 seconds and placed on ice for 1 minute, and the vortexing/cooling
cycle was repeated 8 additional times. The cells were then centrifuged for 6 minutes at
5,000g, and the supernatant was removed to a fresh tube. The beads/pellet were washed
twice with 250 µL of buffer and centrifuged, and the supernatants were joined with the
first supernatant.
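The instruction to pellet "a volume of cells giving an OD x volume equal to 40" amounts to a one-line calculation. A minimal sketch follows; the function name is hypothetical and the OD600 reading in the usage line is an invented example, not a value from the text.

```python
def culture_volume_ml(od600, od_units=40.0):
    """Volume of culture (mL) whose OD600 x volume equals `od_units`."""
    if od600 <= 0:
        raise ValueError("OD600 must be positive")
    return od_units / od600


if __name__ == "__main__":
    # Hypothetical reading: a subculture at OD600 = 2.0 needs 20 mL for 40 OD units.
    print(culture_volume_ml(2.0))
```

Normalizing the inoculum this way gives every strain the same starting cell mass for the galactose induction, regardless of how densely each subculture grew.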

An E. coli strain carrying the pETBlue-1/OS19 construct, described previously,
was used as a positive control for hydratase assays. A culture of this strain was grown to
saturation overnight and diluted 1:20 the following morning in fresh LBC media. The
culture was grown at 37°C and 250 rpm to an OD600 of 0.6, at which point it was induced
with IPTG at a final concentration of 1 mM. The culture was incubated for an additional
two hours at 37°C and 250 rpm. Cells were pelleted, washed with 0.85% NaCl, and
repelleted. Cells were disrupted using BugBuster™ Protein Extraction Reagent and
Benzonase® (Novagen) as per manufacturer's instructions with a 20 minute incubation at
room temperature. After centrifugation at 16,000g and 4°C, the supernatant was
transferred to a new tube and used in the activity assay.

Total protein content of cell extracts from S. cerevisiae described above was
quantified using a microplate Bio-Rad Protein Assay (Bio-Rad, Hercules, CA). The
OS19 constructs (both Apa I-Sal I and Apa I-Kpn I sites) in YPH500, the pESC-Trp
negative control in YPH500, and the pETBlue-1/OS19 construct in E. coli were tested for
their ability to convert acrylyl-CoA to 3-hydroxypropionyl-CoA. The assay was
conducted as previously described for the pETBlue-1/OS19 constructs in the E. coli
Tuner strain. When cell extract of the negative control strain was added to the reaction
mixture containing acrylyl-CoA, one dominant peak of MW 823 was exhibited. This
peak corresponds to acrylyl-CoA and indicates that acrylyl-CoA was not converted to any
other product. When cell extract of the strain carrying a pESC-Trp/OS19 construct
(either Apa I-Sal I or Apa I-Kpn I sites) was added to the reaction mix, the dominant peak
shifted to MW 841, which corresponds to 3-hydroxypropionyl-CoA. The reaction mix
from the E. coli control also showed the MW 841 peak. A time course study was
conducted for the pESC-Trp/OS19 (Apa I-Sal I) construct, which measured the
appearance of the MW 841 and MW 823 peaks after 0, 1, 3, 7, 15, 30, and 60 minutes of
reaction time. An increase in the 3-hydroxypropionyl-CoA peak was seen over time with
the cell extracts from both this construct and the E. coli control, whereas cell extract from
the YPH500 strain with vector only showed a dominant acrylyl-CoA peak.
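The shift from the MW 823 peak to the MW 841 peak is what hydration of the substrate would be expected to produce, since adding one water molecule increases the nominal mass by 18. The sketch below only illustrates that arithmetic; the function name and the tolerance are assumptions and not part of the described LC/MS method.

```python
WATER_NOMINAL_MASS = 18.0   # nominal mass of H2O gained on hydration
ACRYLYL_COA_PEAK = 823      # peak reported for acrylyl-CoA
HP_COA_PEAK = 841           # peak reported for 3-hydroxypropionyl-CoA


def is_hydration_shift(substrate_peak, product_peak, tolerance=0.5):
    """True if the product peak is one water heavier than the substrate peak.

    `tolerance` (mass units) is an illustrative assumption.
    """
    shift = product_peak - substrate_peak
    return abs(shift - WATER_NOMINAL_MASS) <= tolerance


if __name__ == "__main__":
    print(is_hydration_shift(ACRYLYL_COA_PEAK, HP_COA_PEAK))
```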

B. Propionyl CoA-Transferase Activity in Transformed Yeast

Individual colonies of S. cerevisiae strain YPH500 carrying the pESC-His/D-LDH
or pESC-His/D-LDH/PCT construct or the pESC-His vector with no insert (negative
control) were used to inoculate 5 mL cultures of SC-His media containing 2% glucose.
These cultures were grown for 16 hours at 30°C and 250 rpm and were then used to
inoculate 35 mL of the same media. The subcultures were grown for 7 hours at 30°C,
and their OD600s were determined. For each strain, a volume of cells giving an OD x
volume equal to 40 was pelleted, washed with SC-His media with no carbon source, and
repelleted. The cells were suspended in 5 mL of SC-His media containing 2% galactose
and used to inoculate a total volume of 100 mL of this media. Cultures were grown for
16.75 hours at 30°C and 250 rpm. Cells were then pelleted, rinsed in 0.85% NaCl, and
repelleted. Cell pellets (70 mg) were suspended in 140 µL of 100 mM potassium
phosphate buffer, pH 7.5, and an equal volume (pellet plus buffer) of pre-rinsed glass
beads (Sigma, 150-212 microns) was added. This mixture was vortexed for 30 seconds
and placed on ice for 1 minute, and the vortexing/cooling cycle was repeated 8 additional
times. The cells were then centrifuged for 6 minutes at 5,000g, and the supernatant was
removed to a fresh tube. The beads/pellet were washed twice with 250 µL of buffer and
centrifuged, and the supernatants were joined with the first supernatant.

An E. coli strain carrying the pETBlue-1/PCT construct, described previously,
was used as a positive control for propionyl CoA transferase assays. A culture of this
strain was grown to saturation overnight and diluted 1:20 the following morning in fresh
LB media containing 100 µg/mL of carbenicillin. The culture was grown at 37°C and
250 rpm to an OD600 of 0.6, at which point it was induced with IPTG at a final
concentration of 1 mM. The culture was incubated for an additional two hours at 37°C
and 250 rpm. Cells were pelleted, washed with 0.85% NaCl, and repelleted. Cells were
disrupted using BugBuster™ Protein Extraction Reagent and Benzonase® (Novagen) as
per manufacturer's instructions with a 20 minute incubation at room temperature. After
centrifugation at 16,000g and 4°C, the supernatant was transferred to a new tube and used
in the activity assay.

Total protein content of cell extracts was quantified using a microplate Bio-Rad
Protein Assay (Bio-Rad, Hercules, CA). The D-LDH and D-LDH/PCT constructs in S.
cerevisiae strain YPH500, the pESC-His negative control in YPH500, and the pETBlue-
1/PCT construct in E. coli were tested for their ability to catalyze the conversion of
propionyl-CoA and acetate to acetyl-CoA and propionate. The assay mixture used was
that previously described for the pETBlue-1/PCT constructs in the E. coli Tuner strain.
When 1 µg of total cell extract protein of the negative control strain or the
YPH500/pESC-His/D-LDH strain was added to the reaction mixture, no increase in
absorbance (0.00 to 0.00) was seen over 11 minutes. Increases in absorbance from 0.00
to 0.04 and from 0.00 to 0.06 were seen, respectively, with 1 µg of cell extract protein
from the YPH500/pESC-His/D-LDH/PCT strain and the E. coli/PCT strain. With 2 µg
of total cell extract protein, the negative control strain and the YPH500/pESC-His/D-
LDH strain showed an increase in absorbance from 0.00 to 0.01 over 11 minutes, whereas
increases from 0.00 to 0.10 and 0.00 to 0.08 were seen, respectively, with the
YPH500/pESC-His/D-LDH/PCT strain and the E. coli/PCT strain.
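One way to read the absorbance results above is to subtract the change seen with the negative control from the change seen with each test strain at the same protein amount. The sketch below tabulates the reported final absorbance values (all readings started at 0.00) and computes that difference; the dictionary layout, the strain labels, and the function name are illustrative choices, and the keys 1 and 2 simply denote the lower and higher protein amounts used.

```python
# Final absorbance after 11 minutes, as reported in the text; every
# reading started at 0.00. Keys 1 and 2 are the two protein amounts used.
FINAL_ABSORBANCE = {
    1: {"negative control": 0.00, "D-LDH": 0.00,
        "D-LDH/PCT": 0.04, "E. coli/PCT": 0.06},
    2: {"negative control": 0.01, "D-LDH": 0.01,
        "D-LDH/PCT": 0.10, "E. coli/PCT": 0.08},
}


def net_increase(protein_amount, strain):
    """Absorbance increase of `strain` minus that of the negative control."""
    row = FINAL_ABSORBANCE[protein_amount]
    return round(row[strain] - row["negative control"], 2)


if __name__ == "__main__":
    for amount in sorted(FINAL_ABSORBANCE):
        for strain in ("D-LDH", "D-LDH/PCT", "E. coli/PCT"):
            print(amount, strain, net_increase(amount, strain))
```

On this reading, only the strains carrying the PCT gene show an increase above background at either protein amount.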

C. Lactyl-CoA Dehydratase Activity in Transformed Yeast

Individual colonies of S. cerevisiae strain YPH500 carrying the pESC-His/D-LDH
or pESC-His/D-LDH/PCT construct or the pESC-His vector with no insert (negative
control) were used to inoculate 5 mL cultures of SC-His media containing 4% glucose.
These cultures were grown for [illegible] hours at 30°C and used to inoculate 35 mL of
SC-His media containing 2% raffinose. The subcultures were grown for 8 hours at 30°C, and
their OD600s were determined. For each strain, a volume of cells giving an OD x volume
equal to 40 was pelleted, resuspended in 10 mL of SC-His media containing 2%
galactose, and used to inoculate a total volume of 100 mL of this media. Cultures were
grown for 17 hours at 30°C and 250 rpm. Cells were then pelleted, rinsed in 0.85% NaCl,
and repelleted. Cell pellets (190 mg) were suspended in 380 µL of 100 mM potassium
phosphate buffer, pH 7.5, and an equal volume (pellet plus buffer) of pre-rinsed glass
beads (Sigma, 150-212 microns) was added. This mixture was vortexed for 30 seconds
and placed on ice for 1 minute, and the vortexing/cooling cycle was repeated 7 additional
times. The cells were then centrifuged for 6 minutes at 5,000g, and the supernatant was
removed to a fresh tube. The beads/pellet were washed twice with 300 µL of buffer and
centrifuged, and the supernatants were joined with the first supernatant.

An anaerobically-grown culture of E. coli strain DH10B was used as a positive
control for D-LDH assays. A culture of this strain was grown to saturation overnight and
diluted 1:20 the following morning in fresh LB media. The culture was grown
anaerobically at 37°C for 7.5 hours. Cells were pelleted, washed with 0.85% NaCl, and
repelleted. Cells were disrupted using BugBuster™ Protein Extraction Reagent and
Benzonase® (Novagen) as per manufacturer's instructions with a 20-minute incubation at
room temperature. After centrifugation at 16,000g and 4°C, the supernatant was
transferred to a new tube and used in the activity assay.

Total protein content of cell extracts was quantified using a microplate Bio-Rad
Protein Assay (Bio-Rad, Hercules, CA). The D-LDH and D-LDH/PCT constructs in
YPH500, the pESC-His negative control in YPH500, and the anaerobically-grown E. coli
strain were tested for their ability to catalyze the conversion of pyruvate to lactate by
assaying the concurrent oxidation of NADH to NAD. The assay mixture contained 100
mM potassium phosphate buffer, pH 7.5, 0.2 mM NADH, and 0.5-1.0 µg of cell extract.
The reaction was started by the addition of sodium pyruvate to a final concentration of 5
mM, and the decrease in absorbance at 340 nm was measured over 10 minutes. When 0.5
µg of total cell extract protein of the negative control strain was added to the reaction
mixture, a decrease in absorbance from -0.01 to -0.02 was seen over 200 seconds. A
decrease in absorbance from -0.21 to -0.47 and -0.20 to -0.47 over 200 seconds was
seen, respectively, for cell extract from the YPH500/pESC-His/D-LDH or
YPH500/pESC-His/D-LDH/PCT strains. 0.5 µL (7.85 µg) of cell extract from the
anaerobically-grown E. coli strain showed a decrease in absorbance very similar to that
for 1 µg of cell extract of the YPH500/pESC-His/D-LDH/PCT strain. When 4 µg of cell
extract was used, the YPH500/pESC-His/D-LDH/PCT strain showed a decrease in
absorbance from -0.18 to -0.60 over 10 minutes, whereas the negative control strain
showed no decrease in absorbance (-0.08 to -0.08).
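The absorbance decreases at 340 nm reported above can be converted into approximate NADH oxidation rates with the Beer-Lambert law. The sketch below uses the commonly cited molar extinction coefficient of NADH at 340 nm (6,220 M^-1 cm^-1) and assumes a 1 cm optical path; the path length is not stated in the text, so the absolute numbers are illustrative only. The function name is hypothetical.

```python
NADH_EXTINCTION_340 = 6220.0  # M^-1 cm^-1, commonly cited value for NADH at 340 nm


def nadh_oxidation_rate_um_per_min(a_start, a_end, seconds, path_cm=1.0):
    """Approximate NADH oxidation rate (micromolar per minute).

    `path_cm` defaults to 1 cm, which is an ASSUMPTION; the text does
    not give the optical path length of the measurement.
    """
    if seconds <= 0 or path_cm <= 0:
        raise ValueError("seconds and path_cm must be positive")
    delta_a = a_start - a_end                       # positive for a decrease
    molar_change = delta_a / (NADH_EXTINCTION_340 * path_cm)
    return molar_change / seconds * 60.0 * 1e6


if __name__ == "__main__":
    # Readings reported over 200 seconds in the text.
    print(nadh_oxidation_rate_um_per_min(-0.21, -0.47, 200))  # D-LDH construct
    print(nadh_oxidation_rate_um_per_min(-0.01, -0.02, 200))  # negative control
```

Under these assumptions the D-LDH construct oxidizes NADH roughly 26 times faster than the negative control, which matches the ratio of the raw absorbance changes (0.26 versus 0.01).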

D. Demonstration of 3-HP production in S. cerevisiae

The pESC-Trp/OS19/EI, pESC-Leu/EIIα/EIIβ, and pESC-His/D-LDH/PCT
constructs were transformed into a single strain of S. cerevisiae YPH500 using a Frozen-
EZ Yeast Transformation II™ Kit (Zymo Research, Orange, CA). A negative control
strain was also developed by transformation of the pESC-Trp, pESC-Leu, and pESC-His
vectors into a single YPH500 strain. Transformation reactions were plated on SC-Trp-
Leu-His media. Individual yeast colonies were screened by colony PCR for the presence
or absence of nucleic acid corresponding to each construct.

The strain carrying all six genes and the negative control strain were grown in 5
mL of SC-Trp-Leu-His media containing 2% glucose. These cultures were grown for 31
hours at 30°C, and 2 mL was used to inoculate 50 mL of the same media. The
subcultures were grown for 19 hours at 30°C, and their OD600s were determined. For
each strain, a volume of cells giving an OD x volume equal to 100 was pelleted, washed
with SC-Trp-Leu-His media with no carbon source, and repelleted. The cells were
suspended in 10 mL of SC-Trp-Leu-His media containing 2% galactose and 2% raffinose
and used to inoculate a total volume of 250 mL of this media. The cultures were grown
in bottles at 30°C with no shaking, and samples were taken at 0, 4.5, 20, 28.5, 45, and 70
hours. Samples were spun down to remove cells, and the supernatant was filtered using
0.45 micron Acrodisc Syringe Filters (Pall Gelman Laboratory, Ann Arbor, MI).

100 microliters of the filtered broth was used to derive CoA esters of any lactate
or 3-HP in the broth using 6 micrograms of purified propionyl-CoA transferase, 50 mM
potassium phosphate buffer (pH 7.0), and 1 mM acetyl-CoA. The reaction was allowed
to proceed at room temperature for 15 minutes and was stopped by adding 1 volume of 10%
trifluoroacetic acid. The reaction mixtures were purified using Sep-Pak C18 columns as
previously described and analyzed by LC/MS.

Example 14 - Constructing a Biosynthetic Pathway that
Produces Organic Acids from β-alanine
One possible pathway to 3-HP from β-alanine involves the use of a polypeptide
having CoA transferase activity (e.g., an enzyme from a class of enzymes that transfers a
CoA group from one metabolite to the other). As shown in Figure 54, β-alanine can be
converted to β-alanyl-CoA using a polypeptide having CoA transferase activity and CoA
donors such as acetyl-CoA or propionyl-CoA. Alternatively, β-alanyl-CoA can be
generated by the action of a polypeptide having CoA synthetase activity. The β-alanyl-
CoA can be deaminated to form acrylyl-CoA by a polypeptide having β-alanyl-CoA
ammonia lyase activity. The hydration of acrylyl-CoA at the β position to yield 3-HP-
CoA can be carried out by a polypeptide having 3-HP-CoA dehydratase activity. The 3-
HP-CoA can act as a CoA donor for β-alanine, a reaction that can be catalyzed by a
polypeptide having CoA transferase activity, thus yielding 3-HP as a product.
Alternatively, 3-HP-CoA can be hydrolyzed to yield 3-HP by a polypeptide having
specific CoA hydrolase activity.
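The pathway described above can be summarized as an ordered list of conversions, each paired with the enzyme activity named in the text. The sketch below is only a bookkeeping aid for following the route from β-alanine to 3-HP; the data layout and the function name are illustrative and do not appear in the original.

```python
# Each entry: (substrate, product, activity named in the text).
PATHWAY = [
    ("beta-alanine", "beta-alanyl-CoA", "CoA transferase or CoA synthetase"),
    ("beta-alanyl-CoA", "acrylyl-CoA", "beta-alanyl-CoA ammonia lyase"),
    ("acrylyl-CoA", "3-HP-CoA", "3-HP-CoA dehydratase"),
    ("3-HP-CoA", "3-HP", "CoA transferase or CoA hydrolase"),
]


def route(start, end):
    """Return the list of metabolites visited going from `start` to `end`."""
    next_step = {substrate: product for substrate, product, _ in PATHWAY}
    metabolites = [start]
    current = start
    while current != end:
        if current not in next_step:
            raise ValueError(f"no conversion described from {current!r}")
        current = next_step[current]
        metabolites.append(current)
    return metabolites


if __name__ == "__main__":
    print(" -> ".join(route("beta-alanine", "3-HP")))
```

When the final step uses the CoA transferase, the CoA released from 3-HP-CoA is passed to a new molecule of β-alanine, so the first and last steps are coupled in a cycle.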

Methods for isolating, sequencing, expressing, and testing the activity of a 
polypeptide having CoA transferase activity are described herein. 

A. Isolation of a polypeptide having β-alanyl-CoA Ammonia Lyase Activity
Polypeptides having β-alanyl-CoA ammonia lyase activity can catalyze the
conversion of β-alanyl-CoA into acrylyl-CoA. The activity of such polypeptides has been
described by Vagelos et al. (J. Biol. Chem., 234:490-497 (1959)) in Clostridium
propionicum. This polypeptide can be used as part of the acrylate pathway in Clostridium
propionicum to produce propionic acid.

C. propionicum was grown at 37°C in an anoxic medium containing 0.2% yeast
extract, 0.2% trypticase peptone, 0.05% cysteine, 0.5% β-alanine, 0.4% VRB-salts, and 5 mM
potassium phosphate, pH 7.0. The cells were harvested after 12 hours and washed twice
with 50 mM potassium phosphate (Kpi), pH 7.0. About 2 g of wet packed cells were
resuspended in 40 mL of Kpi, pH 7.0, 1 mM MgCl2, 1 mM EDTA, and 1 mM DTT (Buffer
A), and homogenized by sonication at about 85-100 W power using a 3 mm tip (Branson
sonifier 250). Cell debris was removed by centrifugation at 100,000g for 45 minutes in a
Centricon T-1080 Ultra centrifuge, and the cell free extract (~110 U/mg activity) was
subjected to anion exchange chromatography on Source-15Q material. The Source-15Q
column was loaded with 32 mL of cell free extract. The column was developed by a
linear gradient of 0 M to 0.5 M NaCl within 10 column volumes. The polypeptide eluted
between 70-110 mM NaCl.

The solution was adjusted to a final concentration of 1 M (NH4)2SO4 and applied
onto a Resource-Phe column equilibrated with 1 M (NH4)2SO4 in buffer A. The
polypeptide did not bind to this column.

The final preparation was obtained after concentration in an Amicon chamber
(filter cut-off 30 kDa). The functional polypeptide is composed of four polypeptide
subunits, each having a molecular mass of 16 kDa. The polypeptide had a final specific
activity of 1033 U/mg in the standard assay (see below).
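Two quick checks follow from the figures above: the fold purification implied by the specific activities (about 110 U/mg in the cell free extract and 1033 U/mg in the final preparation), and the native mass of a four-subunit assembly of 16 kDa subunits relative to the 30 kDa filter cut-off. The sketch below is illustrative arithmetic only; the names are hypothetical, and the starting activity is the approximate value given in the text.

```python
def fold_purification(final_specific_activity, initial_specific_activity):
    """Ratio of final to initial specific activity (both in U/mg)."""
    if initial_specific_activity <= 0:
        raise ValueError("initial specific activity must be positive")
    return final_specific_activity / initial_specific_activity


def native_mass_kda(subunit_kda, subunit_count):
    """Mass of an assembly of identical subunits, in kDa."""
    return subunit_kda * subunit_count


fold = fold_purification(1033, 110)   # 110 U/mg is approximate in the text
tetramer_kda = native_mass_kda(16, 4)
retained_by_filter = tetramer_kda > 30  # 30 kDa cut-off of the concentrator

if __name__ == "__main__":
    print(f"fold purification: about {fold:.1f}")
    print(f"native mass: {tetramer_kda} kDa, retained: {retained_by_filter}")
```

A 64 kDa tetramer is consistent with the polypeptide being retained on a 30 kDa filter even though each individual subunit is smaller than the cut-off.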

The polypeptide sample after every purification step was separated on a 15%
SDS-PAGE gel. The gel was stained with 0.1% Coomassie R 250, and the destaining
was achieved by using 7.1% acetic acid/5% ethanol solution.

The polypeptide was desalted by RP-HPLC and subjected to N-terminal
sequencing by gas phase Edman degradation. The results of this analysis yielded a 35
amino acid N-terminal sequence of the polypeptide. The sequence was as follows:
MVGKKVVHHLMMSAKDAHYTGNLVNGAWW (SEQ ID NO: 177).

B. Amplification of a Gene Fragment

The 35 amino acid sequence of the polypeptide having β-alanyl-CoA ammonia
lyase activity was used to design primers with which to amplify the corresponding DNA
from the genome of C. propionicum. Genomic DNA from C. propionicum was isolated
using the Gentra Genomic DNA Isolation Kit (Gentra Systems, Minneapolis) following
the genomic DNA protocol for gram-positive bacteria. A codon usage table for
Clostridium propionicum was used to back translate the seven amino acids on either end
of the amino acid sequence to obtain 20-nucleotide degenerate primers:

ACLF: 5'-ATGGTWGGYAARAARGTWGT-3' (SEQ ID NO:178)
ACLR: 5'-TCRCCCCAYTGRTTWACRAT-3' (SEQ ID NO:179)
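The degeneracy of each primer, meaning the number of distinct sequences present in the primer pool, follows directly from the IUPAC ambiguity codes it contains (W = A or T, Y = C or T, R = A or G). The sketch below counts this for ACLF and ACLR; the function name is hypothetical, and the code is an illustration rather than part of the described primer design.

```python
# Standard IUPAC nucleotide codes mapped to the bases they represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "W": "AT", "S": "CG", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct sequences encoded by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])  # KeyError signals a non-IUPAC character
    return count


ACLF = "ATGGTWGGYAARAARGTWGT"
ACLR = "TCRCCCCAYTGRTTWACRAT"

if __name__ == "__main__":
    for name, seq in (("ACLF", ACLF), ("ACLR", ACLR)):
        print(name, len(seq), "nt, degeneracy", degeneracy(seq))
```

Each primer carries five two-fold ambiguous positions, so each pool contains 32 distinct sequences.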
The primers were used in a 50 µL PCR reaction containing 1X Taq PCR buffer,
0.6 µM each primer, 0.2 mM each dNTP, 2 units of Taq DNA polymerase (Roche
Molecular Biochemicals, Indianapolis, IN), 2.5% (v/v) DMSO, and 100 ng of genomic
DNA. PCR was conducted using a touchdown PCR program with 4 cycles at an
annealing temperature of 58°C, 4 cycles at 56°C, 4 cycles at 54°C, and 24 cycles at 52°C.
Each cycle used an initial 30 second denaturing step at 94°C and a 1.25 minute extension
at 72°C, and the program had an initial denaturation step at 94°C for 2 minutes and final
extension at 72°C for 5 minutes. The amounts of PCR primer used in the reaction were
increased three-fold above typical PCR amounts due to the amount of degeneracy in the
3' end of the primer. In addition, separate PCR reactions containing each individual
primer were made to identify PCR product resulting from single degenerate primers.
Twenty µL of each PCR product was separated on a 2.0% TAE (Tris-acetate-EDTA)-
agarose gel.
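The touchdown program described above lowers the annealing temperature in stages before settling at the final temperature. A minimal sketch that expands the stages into a per-cycle annealing schedule is given below; the names are hypothetical, and annealing hold times are omitted because the text does not state them.

```python
# (annealing temperature in deg C, number of cycles), from the text.
TOUCHDOWN_STAGES = [(58, 4), (56, 4), (54, 4), (52, 24)]


def annealing_schedule(stages=TOUCHDOWN_STAGES):
    """Expand the stages into one annealing temperature per cycle."""
    return [temp for temp, cycles in stages for _ in range(cycles)]


schedule = annealing_schedule()

if __name__ == "__main__":
    print(len(schedule), "cycles in total")
    print("first cycle:", schedule[0], "C; last cycle:", schedule[-1], "C")
```

Starting at the higher temperature favors the best-matched members of the degenerate primer pool in the early cycles, before the more permissive 52°C cycles amplify the product.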

5 A band of about 1 00 bp was produced by the reaction containing both the forward 

and reverse primers, but was not present in the individual forward and reverse primer 
control reactions. This fragment was excised and purified using a QIAquick Gel 
Extraction Kit (Qiagen, Valencia, CA). Four microliters of the purified band was ligated 
into pCRH-TOPO vector and transformed by a heat-shock method into TOP 10 K volt 

10 cells using a TOPO cloning procedure (Invitrogen, Carlsbad, CA). Transformations 

were plated on LB media containing 50 |ig/mL of kanamycin and 50 ng/mL of 5-Bromo- 
4-Chloro-3-Indolyl-B-D-Galactopyranoside {X-gal). Individual, white colonies were 
resuspended in 25 pL of 10 mM Tris and heated for 10 minutes at 95°C to break open the 
bacterial cells. Two microliters of the heated cells were used in a 25 pL PCR reaction 

15 using Ml 3R and Ml 3F universal primers homologous to the pCRII-TOPO vector. The 
PCR mix contained the following: IX Taq PCR buffer, 0.2 pM each primer, 0.2 mM each 
dNTP, and 1 unit of Taq DNA polymerase per reaction. The PCR reaction was 
performed in a MJ Research PTC100 under the following conditions: an initial 
denaturation at 94°C for 2 minutes; 30 cycles of 94°C for 30 seconds, 52°C for 1 minute, 

20 and 72°C for 1 .25 minutes; and a final extension for 7 minutes at 72°C. Plasmid DNA 
was obtained (QIAprep Spin Miniprep Kit, Qiagen) from cultures of colonies showing the 
desired insert and was used for DNA sequencing with M13R universal primer. The 
following nucleic acid sequence was internal to the degenerate primers and corresponds
to a portion of the 35 amino acid residue sequence:
5'-ACATCATTTAATGATGAGCCCAAAAGATGCTCACTATACTGGAAACTTAGTAAACGGCGCTAGA-3'
(SEQ ID NO:180).

C. Genome Walking to Obtain the Complete Coding Sequence 

Primers for conducting genome walking in both upstream and downstream
directions were designed using the portion of the nucleic acid sequence that was internal
to the degenerate primers. The primer sequences were as follows:

ACLGSP1F: 5'-GTACATCATTTAATGATGAGGGCAAAAGATG-3' (SEQ ID NO:181)
ACLGSP2F: 5'-GATGCTCACTATACTGGAAACTTAGTAAAC-3' (SEQ ID NO:182)
ACLGSP1R: 5'-ATTCTAGCGGCGTTTACTAAGTTTCCAG-3' (SEQ ID NO:183)
ACLGSP2R: 5'-CCAGTATAGTGAGCATCTTTTGGGCTCATC-3' (SEQ ID NO:184)

GSP1F and GSP2F are primers facing downstream, GSP1R and GSP2R are
primers facing upstream, and GSP2F and GSP2R are primers nested inside GSP1F and
GSP1R, respectively. Genome walking libraries were constructed according to the
manual for CLONTECH's Universal Genome Walking Kit (CLONTECH Laboratories,
Palo Alto, CA), with the exception that the restriction enzymes Ssp I and Hinc II were
used in addition to Dra I, EcoR V, and Pvu II. PCR was conducted in a Perkin Elmer
9700 Thermocycler using the following reaction mix: 1X XL Buffer II, 0.2 mM each
dNTP, 1.25 mM Mg(OAc)2, 0.2 µM each primer, 2 units of rTth DNA polymerase XL
(Applied Biosystems, Foster City, CA), and 1 µL of library per 50 µL reaction. First
round PCR used an initial denaturation at 94°C for 5 seconds; 7 cycles consisting of 2 sec
at 94°C and 3 min at 70°C; 32 cycles consisting of 2 sec at 94°C and 3 min at 64°C; and a
final extension at 64°C for 4 min. Second round PCR used an initial denaturation at 94°C
for 15 seconds; 5 cycles consisting of 5 sec at 94°C and 3 min at 70°C; 26 cycles
consisting of 5 sec at 94°C and 3 min at 64°C; and a final extension at 66°C for 7 min.
Twenty µL of each first and second round product was run on a 1.0% TAE-agarose gel.
In the second round PCR for the forward reactions, a 1.4 Kb band was obtained for Dra I,
a 1.5 Kb band for Hinc II, a 4.0 Kb band for Pvu II, and 2.0 and 2.6 Kb bands were
obtained for Ssp I. In the second round PCR for the reverse reactions, a 1.5 Kb band was
obtained for Dra I, a 0.8 Kb band for EcoR V, a 2.0 Kb band for Hinc II, a 2.9 Kb band
for Pvu II, and a 1.5 Kb band was obtained for Ssp I. Several of these fragments were gel
purified, cloned, and sequenced.

The coding sequence of the polypeptide having β-alanyl-CoA ammonia lyase
activity is set forth in SEQ ID NO:162. This coding sequence encodes the amino acid
sequence set forth in SEQ ID NO:160. The coding sequence was cloned and expressed in
bacterial cells. A polypeptide with the expected size was isolated and tested for
enzymatic activity.

The isolation of a nucleic acid molecule encoding a polypeptide having 3-HP-CoA
dehydratase activity (e.g., the seventh enzymatic activity in Figure 54, which can be
accomplished with a polypeptide having the amino acid sequence set forth in SEQ ID
NO:41) is described herein. This polypeptide in combination with a polypeptide having
CoA transferase activity (e.g., a polypeptide having the amino acid sequence set forth in
SEQ ID NO:2) and a polypeptide having β-alanyl-CoA ammonia lyase activity (e.g., a
polypeptide having the amino acid sequence set forth in SEQ ID NO:160) provides one
method of making 3-HP from β-alanine.

Example 15 Constructing a Biosynthetic Pathway that
Produces Organic Acids from β-alanine

In another pathway, β-alanine generated from aspartate can be deaminated by a
polypeptide having 4,4-aminobutyrate aminotransferase activity (Figure 55). This
reaction also can regenerate glutamate that is consumed in the formation of aspartate.
The deamination of β-alanine can yield malonate semialdehyde, which can be further
reduced to 3-HP by a polypeptide having 3-hydroxypropionate dehydrogenase activity or
a polypeptide having 3-hydroxyisobutyrate dehydrogenase activity. Such polypeptides
can be obtained as follows.

A. Cloning gabT (4-aminobutyrate aminotransferase) from C. acetobutylicum

The following PCR primers were designed based on a published sequence for a
gabT gene from Clostridium acetobutylicum (GenBank# AE007654):

Cac aba nco sen: 5'-GAGCCATGGAAGAAATAAATGCTAAAG-3' (SEQ ID NO:185)
Cac aba bam anti: 5'-AGA^ATGGCTTTTT V (SEQ ID NO:186)

The primers introduced a Nco I site at the 5' end and a BamH I site at the 3' end. A
PCR reaction was set up using chromosomal DNA from C. acetobutylicum as the
template.
Reaction mix:

    H2O                                80.75 µL
    Taq Plus Long 10x Buffer           10 µL
    dNTP mix (10 mM)                   3 µL
    Cac aba nco sen (20 µM)            2 µL
    Cac aba bam anti (20 µM)           2 µL
    C. acetobutylicum DNA (~100 ng)    1 µL
    Taq Plus Long (5 U/µL)             1 µL
    Pfu (2.5 U/µL)                     0.25 µL

PCR Program:

    94°C    5 minutes
    25 cycles of:
        94°C    30 seconds
        50°C    30 seconds
        72°C    80 seconds + 2 seconds/cycle
    1 cycle of:
        68°C    7 minutes
    4°C until use
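
The reaction mix for the gabT amplification can be sanity-checked with a little arithmetic. The following Python sketch sums the listed volumes and derives the final concentration of each component whose stock concentration is stated, using C_final = C_stock x V / V_total. It assumes that the figures in parentheses are stock concentrations and that the primer stocks are 20 µM, as listed for the mmsB reaction; the names `REACTION_MIX_UL`, `STOCKS`, and `final_concentrations` are illustrative and are not part of the original procedure.

```python
# Volumes in microliters, as listed for the gabT amplification.
REACTION_MIX_UL = {
    "H2O": 80.75,
    "Taq Plus Long 10x Buffer": 10.0,
    "dNTP mix": 3.0,
    "Cac aba nco sen": 2.0,
    "Cac aba bam anti": 2.0,
    "C. acetobutylicum DNA": 1.0,
    "Taq Plus Long": 1.0,
    "Pfu": 0.25,
}

# Stock concentrations as (value, unit) for components that state one.
# Assumption: primer stocks are 20 µM, matching the mmsB reaction.
STOCKS = {
    "Taq Plus Long 10x Buffer": (10.0, "x"),
    "dNTP mix": (10.0, "mM"),
    "Cac aba nco sen": (20.0, "µM"),
    "Cac aba bam anti": (20.0, "µM"),
}


def final_concentrations(mix_ul, stocks):
    """Return (total volume in µL, {component: (final value, unit)})."""
    total_ul = sum(mix_ul.values())
    finals = {
        name: (value * mix_ul[name] / total_ul, unit)
        for name, (value, unit) in stocks.items()
    }
    return total_ul, finals


if __name__ == "__main__":
    total_ul, finals = final_concentrations(REACTION_MIX_UL, STOCKS)
    print(f"total volume: {total_ul:g} µL")
    for name, (value, unit) in finals.items():
        print(f"{name}: {value:g} {unit}")
```

Under these assumptions the volumes sum to 100 µL, which corresponds to 1x buffer, 0.3 mM dNTP mix, and 0.4 µM of each primer in the final reaction.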



Upon agarose gel analysis, a single band of ~1.3 Kb in size was observed. This
fragment was purified using a QIAquick PCR purification kit (Qiagen, Valencia, CA) and
cloned into pCRII TOPO using the TOPO Zero Blunt PCR cloning kit (Invitrogen,
Carlsbad, CA). 1 µL of the pCRII TOPO ligation mix was used to transform chemically
competent TOP10 E. coli cells. The cells were recovered for 1 hour in SOC media, and the
transformants were selected on LB/kanamycin (50 µg/mL) plates. Single colonies of the
transformant were grown overnight in LB/kanamycin media, and the plasmid DNA was
extracted using a Mini prep kit (Qiagen, Valencia, CA). The super-coiled plasmid DNA
was separated on a 1% agarose gel and digested, and the colonies with insert were selected.
The insert was sequenced to confirm the sequence and its quality.

The plasmid having the correct insert was digested with restriction enzymes Nco I
and BamH I. The digested insert was gel isolated and ligated to pET28b expression
vector that was also restricted with Nco I and BamH I enzymes. 1 µL of ligation mix was
used to transform chemically competent TOP10 E. coli cells. The cells were recovered
for 1 hour in SOC media, and the transformants were selected on LB/kanamycin (50
µg/mL) plates. The super-coiled plasmid DNA was separated on a 1% agarose gel and
digested, and the colonies with insert were selected. The plasmid with the insert was
isolated using a Mini prep kit (Qiagen, Valencia, CA), and 1 µL of this plasmid DNA was
used to transform electrocompetent BL21(DE3) (Novagen, Madison, WI). These cells
were used to check the expression of a polypeptide having 4-aminobutyrate
aminotransferase activity.
B. Cloning mmsB (3-hydroxyisobutyrate dehydrogenase) from P. aeruginosa

The following PCR primers were designed based on a published sequence for a
mmsB gene from Pseudomonas aeruginosa (GenBank# M84911):

Ppu hid nde sen: 5'-ATACATATGACCGACCGACATCGCATT-3' (SEQ ID NO:186)
Ppu hid sal anti: 5'-ATAGTCGACGGGTCAGTCCTTGCGGCG-3' (SEQ ID NO:187)

The primers introduced a Nde I site at the 5' end and a BamH I site at the 3' end.
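
One quick way to check a primer design of this kind is to scan each primer for the recognition sequence of the intended restriction enzyme. The Python sketch below does this with plain string search. The recognition sequences are the standard ones for these four enzymes; the dictionary and function names are illustrative and are not part of the original procedure. All four sites are palindromic, so searching the primer as written is sufficient.

```python
# Standard recognition sequences (5'->3') for the enzymes named in this example.
RECOGNITION_SITES = {
    "NcoI": "CCATGG",
    "NdeI": "CATATG",
    "BamHI": "GGATCC",
    "SalI": "GTCGAC",
}

# mmsB primer sequences as printed for SEQ ID NO:186 and SEQ ID NO:187.
PRIMERS = {
    "Ppu hid nde sen": "ATACATATGACCGACCGACATCGCATT",
    "Ppu hid sal anti": "ATAGTCGACGGGTCAGTCCTTGCGGCG",
}


def find_sites(sequence, sites=RECOGNITION_SITES):
    """Return {enzyme: 0-based index of first match} for sites present in sequence."""
    sequence = sequence.upper()
    hits = {}
    for enzyme, site in sites.items():
        position = sequence.find(site)
        if position != -1:
            hits[enzyme] = position
    return hits


if __name__ == "__main__":
    for name, sequence in PRIMERS.items():
        print(name, find_sites(sequence))
```

With the sequences as printed, the scan reports an NdeI site (CATATG) in the sense primer and a SalI site (GTCGAC) in the antisense primer, which matches the "sal" in that primer's name; no GGATCC (BamHI) site is found in either primer.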



Reaction mix:

    H2O                                          80.75 µL
    Taq Plus Long 10x Buffer                     10 µL
    dNTP mix (10 mM)                             3 µL
    Ppu hid nde sen (20 µM)                      2 µL
    Ppu hid sal anti (20 µM)                     2 µL
    P. aeruginosa DNA (~100 ng)                  1 µL
    Taq Plus Long (Stratagene, La Jolla, CA)     1 µL
    Pfu (Stratagene, La Jolla, CA)               0.25 µL

PCR Program:

    94°C    5 minutes
    25 cycles of:
        94°C    30 seconds
        55°C    30 seconds
        72°C    90 seconds + 2 seconds/cycle
    68°C    7 minutes
    4°C until use



A PCR reaction was set up using chromosomal DNA from P. aeruginosa as the
template. Chromosomal DNA was obtained from ATCC (Manassas, VA) P. aeruginosa
17933D.
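
The "+ 2 seconds/cycle" entry in the mmsB PCR program lengthens the extension step as cycling proceeds. The following Python sketch works out the per-cycle extension time and the total programmed hold time. It assumes that the first cycle uses the base extension time of 90 seconds and that each later cycle adds 2 seconds; ramping time between temperatures and the final 4°C hold are ignored. The class and field names are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CyclingProgram:
    """Hold times, in seconds, for a three-step PCR with an extension increment."""

    initial_denaturation_s: int
    cycles: int
    denaturation_s: int
    annealing_s: int
    extension_base_s: int
    extension_increment_s: int
    final_extension_s: int

    def extension_time(self, cycle: int) -> int:
        """Extension time for a 1-based cycle number.

        Assumption: cycle 1 uses the base time; each later cycle adds the increment.
        """
        if not 1 <= cycle <= self.cycles:
            raise ValueError(f"cycle must be between 1 and {self.cycles}")
        return self.extension_base_s + self.extension_increment_s * (cycle - 1)

    def total_hold_time_s(self) -> int:
        """Sum of all programmed holds, excluding ramping and the 4°C hold."""
        cycling = sum(
            self.denaturation_s + self.annealing_s + self.extension_time(cycle)
            for cycle in range(1, self.cycles + 1)
        )
        return self.initial_denaturation_s + cycling + self.final_extension_s


# mmsB program: 94°C 5 min; 25 x (94°C 30 s, 55°C 30 s, 72°C 90 s + 2 s/cycle); 68°C 7 min.
MMSB_PROGRAM = CyclingProgram(
    initial_denaturation_s=300,
    cycles=25,
    denaturation_s=30,
    annealing_s=30,
    extension_base_s=90,
    extension_increment_s=2,
    final_extension_s=420,
)

if __name__ == "__main__":
    print("extension in last cycle (s):", MMSB_PROGRAM.extension_time(25))
    print("total hold time (min):", MMSB_PROGRAM.total_hold_time_s() / 60)
```

Under this reading the extension step grows from 90 seconds in the first cycle to 138 seconds in the last, and the programmed holds total 5070 seconds, or about 84.5 minutes.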

Upon agarose gel analysis, a single band of ~1.6 Kb in size was observed. This
fragment was purified using a QIAquick PCR purification kit (Qiagen, Valencia, CA) and
cloned into pCRII TOPO using the TOPO Zero Blunt PCR cloning kit (Invitrogen,
Carlsbad, CA). 1 µL of the pCRII TOPO ligation mix was used to transform chemically
competent TOP10 E. coli cells. The cells were recovered for 1 hour in SOC media, and
the transformants were selected on LB/kanamycin (50 µg/mL) plates. Single colonies of
the transformant were grown overnight in LB/kanamycin media, and the plasmid DNA was
extracted using a Mini prep kit (Qiagen, Valencia, CA). The super-coiled plasmid DNA
was separated on a 1% agarose gel and digested, and the colonies with insert were
selected. The insert was sequenced to confirm the sequence and its quality.

The plasmid having the correct insert was digested with restriction enzymes Nde I
and BamH I. The digested insert was gel isolated and ligated to pET30a expression vector
that was also restricted with Nde I and BamH I enzymes. 1 µL of ligation mix was used
to transform chemically competent TOP10 E. coli cells. The cells were recovered for 1
hour in SOC media, and the transformants were selected on LB/kanamycin (50 µg/mL)
plates. The super-coiled plasmid DNA was separated on a 1% agarose gel and digested,
and the colonies with insert were selected. The plasmid with the insert was isolated using
a Mini prep kit (Qiagen, Valencia, CA), and 1 µL of this plasmid DNA was used to
transform electrocompetent BL21(DE3) (Novagen, Madison, WI). These cells were used
to check the expression of a polypeptide having 3-hydroxyisobutyrate dehydrogenase
activity.

OTHER EMBODIMENTS

It is to be understood that while the invention has been described in conjunction
with the detailed description thereof, the foregoing description is intended to illustrate and
not limit the scope of the invention, which is defined by the scope of the appended claims.
Other aspects, advantages, and modifications are within the scope of the following
claims.
WHAT IS CLAIMED IS:

1. A cell comprising lactyl-CoA dehydratase activity and 3-hydroxypropionyl-CoA
dehydratase activity.

2. The cell of claim 1, wherein said cell comprises an activity selected from the
group consisting of E1 activator activity, E2 α activity, and E2 β activity.

3. The cell of claim 1, wherein said cell comprises 3-hydroxypropionyl-CoA
dehydratase activity.

4. The cell of claim 1, wherein said cell comprises CoA transferase activity.

5. The cell of claim 1, wherein said cell comprises an exogenous nucleic acid
comprising:

(a) a sequence set forth in SEQ ID NO:1, 9, 17, 25, 33, 34, 36, 38, 40, 42, 129,
140, 142, 162, or 163; or

(b) a nucleic acid sequence that shares at least 65 percent sequence identity with a
sequence set forth in SEQ ID NO:1, 9, 17, 25, 33, 34, 36, 38, 40, 42, 129, 140, 142, 162,
or 163.

6. The cell of claim 1, wherein said cell comprises 3-hydroxypropionyl-CoA
hydrolase activity or 3-hydroxyisobutyryl-CoA hydrolase activity.

7. The cell of claim 1, wherein said cell comprises lipase activity.

8. The cell of claim 1, wherein said cell produces 3-HP.

9. The cell of claim 1, wherein said cell produces an ester of 3-HP.
10. The cell of claim 9, wherein said ester is selected from the group consisting of
methyl 3-hydroxypropionate, ethyl 3-hydroxypropionate, propyl 3-hydroxypropionate,
butyl 3-hydroxypropionate, and 2-ethylhexyl 3-hydroxypropionate.

11. The cell of claim 1, wherein said cell comprises CoA synthetase activity.

12. The cell of claim 1, wherein said cell comprises poly hydroxyacid synthase
activity.

13. The cell of claim 1, wherein said cell produces polymerized 3-HP.

14. The cell of claim 1, wherein said cell is prokaryotic.

15. The cell of claim 1, wherein said cell is selected from the group consisting of
yeast, Lactobacillus, Lactococcus, Bacillus, and Escherichia cells.

16. A cell comprising CoA synthetase activity, lactyl-CoA dehydratase activity, and
poly hydroxyacid synthase activity.

17. The cell of claim 16, wherein said cell comprises an activity selected from the
group consisting of E1 activator activity, E2 α activity, and E2 β activity.

18. The cell of claim 16, wherein the cell produces polymerized acrylate.

19. The cell of claim 16, wherein said cell is prokaryotic.

20. The cell of claim 16, wherein said cell is selected from the group consisting of
yeast, Lactobacillus, Lactococcus, Bacillus, and Escherichia cells.

21. A cell comprising CoA transferase activity, lactyl-CoA dehydratase activity, and
lipase activity.
22. The cell of claim 21, wherein said cell comprises an activity selected from the
group consisting of E1 activator activity, E2 α activity, and E2 β activity.

23. The cell of claim 21, wherein said cell produces an ester of acrylate.

24. The cell of claim 23, wherein said ester is selected from the group consisting of
methyl acrylate, ethyl acrylate, propyl acrylate, and butyl acrylate.

25. The cell of claim 21, wherein said cell is prokaryotic.

26. The cell of claim 21, wherein said cell is selected from the group consisting of
yeast, Lactobacillus, Lactococcus, Bacillus, and Escherichia cells.

27. A polypeptide comprising an amino acid sequence selected from the group
consisting of:

(a) a sequence set forth in SEQ ID NO:2, 10, 18, 26, 35, 37, 39, 41, 141, 160, or
161;

(b) a sequence having at least 10 contiguous amino acid residues of a sequence set
forth in SEQ ID NO:2, 10, 18, 26, 35, 37, 39, 41, 141, 160, or 161;

(c) a sequence that has at least 65 percent sequence identity with a sequence set
forth in SEQ ID NO:2, 10, 18, 26, 35, 37, 39, 41, 141, 160, or 161;

(d) a sequence that has at least 65 percent sequence identity with at least 10
contiguous amino acid residues of a sequence set forth in SEQ ID NO:2, 10, 18, 26, 35,
37, 39, 41, 141, 160, or 161; and

(e) a sequence set forth in SEQ ID NO:2, 10, 18, 26, 35, 37, 39, 41, 141, 160, or
161 that contains at least one conservative substitution.

28. A nucleic acid molecule comprising a nucleic acid sequence that encodes the
polypeptide of claim 27.
29. A transformed cell comprising at least one exogenous nucleic acid molecule,
wherein said molecule comprises a nucleic acid sequence that encodes the polypeptide of
claim 27.

30. The cell of claim 29, wherein the cell produces 3-HP.

31. The cell of claim 29, wherein said exogenous nucleic acid molecule encodes an
E2 α polypeptide of an enzyme having lactyl-CoA dehydratase activity.

32. The cell of claim 29, wherein said exogenous nucleic acid molecule encodes an
E2 β polypeptide of an enzyme having said lactyl-CoA dehydratase activity.

33. The cell of claim 29, wherein said exogenous nucleic acid molecule encodes a
polypeptide having 3-hydroxypropionyl-CoA dehydratase activity or CoA transferase
activity.

34. The cell of claim 29, wherein said exogenous nucleic acid molecule encodes a
polypeptide having 3-hydroxypropionyl-CoA hydrolase activity or
3-hydroxyisobutyryl-CoA hydrolase activity.

35. The cell of claim 29, wherein the cell comprises lipase activity.

36. The cell of claim 29, wherein the cell produces an ester of 3-HP.

37. The cell of claim 36, wherein said ester is selected from the group consisting of
methyl 3-hydroxypropionate, ethyl 3-hydroxypropionate, propyl 3-hydroxypropionate,
butyl 3-hydroxypropionate, and 2-ethylhexyl 3-hydroxypropionate.

38. The cell of claim 29, wherein said cell comprises CoA synthetase activity.

39. The cell of claim 29, wherein said cell produces polymerized 3-HP.
40. The cell of claim 29, wherein said cell is prokaryotic.

41. The cell of claim 29, wherein said cell is selected from the group consisting of
Lactobacillus, Lactococcus, Bacillus, and Escherichia cells.

42. The cell of claim 29, wherein the cell is a yeast cell.

43. A specific binding agent that specifically binds to the polypeptide of claim 27.

44. An isolated nucleic acid molecule comprising a nucleic acid sequence selected
from the group consisting of:

(a) a sequence set forth in SEQ ID NO:1, 9, 17, 25, 33, 34, 36, 38, 40, 42, 129,
140, 142, 162, or 163;

(b) a sequence having at least 10 contiguous nucleotides of a sequence set forth in
SEQ ID NO:1, 9, 17, 25, 33, 34, 36, 38, 40, 42, 129, 140, 142, 162, or 163;

(c) a sequence that has at least 65 percent sequence identity with a sequence set
forth in SEQ ID NO:1, 9, 17, 25, 33, 34, 36, 38, 40, 42, 129, 140, 142, 162, or 163;

(d) a sequence that has at least 65 percent sequence identity with at least 10
contiguous nucleotides of a sequence set forth in SEQ ID NO:1, 9, 17, 25, 33, 34, 36, 38,
40, 42, 129, 140, 142, 162, or 163; and

(e) a sequence that hybridizes under moderately stringent conditions to a sequence set
forth in SEQ ID NO:1, 9, 17, 25, 33, 34, 36, 38, 40, 42, 129, 140, 142, 162, or 163.

45. A production cell comprising an isolated nucleic acid molecule of claim 44 that is
exogenous to said production cell.

46. The cell of claim 45, wherein said isolated nucleic acid molecule encodes a
polypeptide having an enzymatic activity selected from the group consisting of CoA
transferase activity, lactyl-CoA dehydratase activity, CoA synthase activity, CoA
dehydratase activity, dehydrogenase activity, malonyl-CoA reductase activity, and
3-hydroxypropionyl-CoA dehydratase activity.

47. A method of producing a polypeptide, comprising culturing the cell of claim 45
under conditions that allow said cell to produce said polypeptide, wherein said
polypeptide is produced.

48. A method for making 3-HP, said method comprising culturing at least one cell
comprising at least one exogenous nucleic acid molecule that encodes at least one
polypeptide that is capable of producing said 3-HP from PEP under conditions such that
said 3-HP is produced.

49. The method of claim 48, wherein said cell is selected from the group consisting of
yeast, Lactobacillus, Lactococcus, Bacillus, and Escherichia cells.

50. The method of claim 48, wherein 3-HP is made by a biosynthetic route that
utilizes a β-alanine intermediate.

51. The method of claim 48, wherein 3-HP is made by a biosynthetic route that
utilizes a malonyl-CoA intermediate.

52. The method of claim 48, wherein 3-HP is made by a biosynthetic route that
utilizes a lactate intermediate.

53. A method for making 3-HP, said method comprising culturing at least one cell
comprising at least one exogenous nucleic acid molecule that encodes at least one
polypeptide that is capable of producing said 3-HP from lactate under conditions such
that said 3-HP is produced.

54. The method of claim 53, wherein said cells are selected from the group consisting
of yeast, Lactobacillus, Lactococcus, Bacillus, and Escherichia cells.
55. A method for making 3-HP, said method comprising culturing at least one cell
under conditions wherein said cell produces said 3-HP, said cell comprising lactyl-CoA
dehydratase activity and 3-hydroxypropionyl-CoA dehydratase activity.

56. The method of claim 55, wherein said cell is selected from the group consisting of
yeast, Lactobacillus, Lactococcus, Bacillus, and Escherichia cells.

57. The method of claim 55, wherein said cell comprises CoA transferase activity.

58. The method of claim 55, wherein said cell comprises 3-hydroxypropionyl-CoA
hydrolase activity or 3-hydroxyisobutyryl-CoA hydrolase activity.

59. A method for making 3-HP, said method comprising:

a) contacting lactate with a first polypeptide having CoA transferase activity to
form lactyl-CoA,

b) contacting said lactyl-CoA with a second polypeptide having lactyl-CoA
dehydratase activity to form acrylyl-CoA,

c) contacting said acrylyl-CoA with a third polypeptide having
3-hydroxypropionyl-CoA dehydratase activity to form 3-HP-CoA, and

d) contacting said 3-HP-CoA with said first polypeptide to form said 3-HP or with
a fourth polypeptide having 3-hydroxypropionyl-CoA hydrolase activity or
3-hydroxyisobutyryl-CoA hydrolase activity to form said 3-HP.

60. A method for making polymerized 3-HP, said method comprising culturing a cell
under conditions wherein said cell produces said polymerized 3-HP, said cell comprising
lactyl-CoA dehydratase activity and 3-hydroxypropionyl-CoA dehydratase activity.

61. The method of claim 60, wherein said cell is selected from the group consisting of
yeast, Lactobacillus, Lactococcus, Bacillus, and Escherichia cells.






WO 02/042418 



PCT/US01/43607 



62. The method of claim 60, wherein said cell comprises CoA synthetase activity.

63. The method of claim 60, wherein said cell comprises poly hydroxyacid synthase
activity.

64. A method for making polymerized 3-HP, said method comprising:

a) contacting lactate with a first polypeptide having CoA synthetase activity to
form lactyl-CoA,

b) contacting said lactyl-CoA with a second polypeptide having lactyl-CoA
dehydratase activity to form acrylyl-CoA,

c) contacting said acrylyl-CoA with a third polypeptide having
3-hydroxypropionyl-CoA dehydratase activity to form 3-hydroxypropionic acid-CoA, and

d) contacting said 3-hydroxypropionic acid-CoA with a fourth polypeptide having
poly hydroxyacid synthase activity to form said polymerized 3-HP.

65. A method for making an ester of 3-HP, said method comprising culturing a cell
under conditions wherein said cell produces said ester, said cell comprising lactyl-CoA
dehydratase activity and 3-hydroxypropionyl-CoA dehydratase activity.

66. The method of claim 65, wherein said cell is selected from the group consisting of
yeast, Lactobacillus, Lactococcus, Bacillus, and Escherichia cells.

67. The method of claim 65, wherein said cell comprises CoA transferase activity.

68. The method of claim 65, wherein said cell comprises 3-hydroxypropionyl-CoA
hydrolase activity or 3-hydroxyisobutyryl-CoA hydrolase activity.

69. A method for making an ester of 3-HP, said method comprising:

a) contacting lactate with a first polypeptide having CoA transferase activity to
form lactyl-CoA,

b) contacting said lactyl-CoA with a second polypeptide having lactyl-CoA
dehydratase activity to form acrylyl-CoA,

c) contacting said acrylyl-CoA with a third polypeptide having
3-hydroxypropionyl-CoA dehydratase activity to form 3-hydroxypropionic acid-CoA,

d) contacting said 3-hydroxypropionic acid-CoA with said first polypeptide to
form 3-HP or a fourth polypeptide having 3-hydroxypropionyl-CoA hydrolase activity or
3-hydroxyisobutyryl-CoA hydrolase activity to form 3-HP, and

e) contacting said 3-HP with a fifth polypeptide having lipase activity to form said
ester.

70. A method for making polymerized acrylate, said method comprising culturing a
cell under conditions wherein said cell produces said polymerized acrylate, said cell
comprising CoA synthetase activity and lactyl-CoA dehydratase activity.

71. The method of claim 70, wherein said cell is selected from the group consisting of
yeast, Lactobacillus, Lactococcus, Bacillus, and Escherichia cells.

72. The method of claim 70, wherein said cell comprises poly hydroxyacid synthase
activity.

73. A method for making polymerized acrylate, said method comprising:

a) contacting lactate with a first polypeptide having CoA synthetase activity to
form lactyl-CoA,

b) contacting said lactyl-CoA with a second polypeptide having lactyl-CoA
dehydratase activity to form acrylyl-CoA, and

c) contacting said acrylyl-CoA with a third polypeptide having poly hydroxyacid
synthase activity to form said polymerized acrylate.

74. A method for making an ester of acrylate, said method comprising culturing a cell
under conditions wherein said cell produces said ester, said cell comprising CoA
transferase activity and lactyl-CoA dehydratase activity.
75. The method of claim 74, wherein said cell is selected from the group consisting of
yeast, Lactobacillus, Lactococcus, Bacillus, and Escherichia cells.

76. The method of claim 74, wherein said cell comprises lipase activity.

77. A method for making an ester of acrylate, said method comprising:

a) contacting lactate with a first polypeptide having CoA transferase activity to
form lactyl-CoA,

b) contacting said lactyl-CoA with a second polypeptide having lactyl-CoA
dehydratase activity to form acrylyl-CoA,

c) contacting said acrylyl-CoA with said first polypeptide to form acrylate, and

d) contacting said acrylate with a third polypeptide having lipase activity to form
said ester.

78. A method for making 3-HP, said method comprising culturing a cell under
conditions wherein said cell produces said 3-HP, said cell comprising at least one
exogenous nucleic acid that encodes at least one polypeptide such that said 3-HP is
produced from acetyl-CoA and under conditions such that said 3-HP is produced.

79. The method of claim 78, wherein said cell is selected from the group consisting of
yeast, Lactobacillus, Lactococcus, Bacillus, and Escherichia cells.

80. A method for making 3-HP, said method comprising culturing a cell under
conditions wherein said cell produces said 3-HP, said cell comprising at least one
exogenous nucleic acid that encodes at least one polypeptide such that said 3-HP is
produced from malonyl-CoA and under conditions such that said 3-HP is produced.

81. The method of claim 80, wherein said cell is selected from the group consisting of
yeast, Lactobacillus, Lactococcus, Bacillus, and Escherichia cells.

82. A method for making 3-HP, said method comprising culturing a cell under
conditions wherein said cell produces said 3-HP, said cell comprising at least one
exogenous nucleic acid that encodes at least one polypeptide such that said 3-HP is
produced from β-alanine and under conditions such that said 3-HP is produced.

83. The method of claim 82, wherein said cell is selected from the group consisting of
yeast, Lactobacillus, Lactococcus, Bacillus, and Escherichia cells.

84. A method for making 3-HP, said method comprising culturing cells comprising an
exogenous nucleic acid that encodes polypeptides that are capable of producing 3-HP
from acetyl-CoA under conditions such that said 3-HP is produced.

85. The method of claim 84, wherein said cells are selected from the group consisting
of yeast, Lactobacillus, Lactococcus, Bacillus, and Escherichia cells.

86. A method for making 3-HP, said method comprising culturing cells comprising at
least one exogenous nucleic acid that encodes polypeptides that are capable of producing
said 3-HP from malonyl-CoA, and under conditions such that said 3-HP is produced.

87. The method of claim 86, wherein said cells are selected from the group consisting
of yeast, Lactobacillus, Lactococcus, Bacillus, and Escherichia cells.

88. A method for making 3-HP, said method comprising:

a) contacting acetyl-CoA with a first polypeptide having acetyl-CoA carboxylase
activity to form malonyl-CoA, and

b) contacting said malonyl-CoA with a second polypeptide having malonyl-CoA
reductase activity to form said 3-HP.

89. A method for making 3-HP, said method comprising contacting malonyl-CoA
with a polypeptide having malonyl-CoA reductase activity to form said 3-HP.

90. A method for making 3-HP, said method comprising:

a) contacting β-alanine CoA with a first polypeptide having β-alanyl-CoA
ammonia lyase activity to form acrylyl-CoA;

b) contacting said acrylyl-CoA with a second polypeptide having 3HP-CoA
dehydratase activity to form said 3-HP-CoA; and

c) contacting 3-HP-CoA with a third polypeptide having glutamate dehydrogenase
to make 3-HP.
Figure 6

ATGAGAAAAGTAGAAATCATTACAGCTGAACAAGCAGCTCAGCTGGTAAAAGACJUVCGAC 

•ACGATTACGTCTATC<3GC^TGTCAGCAGCGCeCATCCGGAAGCACTGA.CCAAA<3CTTTG 

GAAAAACGGTTCCTGGACACGAACACCCCGCAGAACTTGACCTACATCTATGCAGQCTCT 

CAGGGCAAACGCGATGGCCGTGTCGCTGAACATCTGGCACACACAGXaCCTTTTGAAACGC 

GCCATCATCGGTCACTGGCAGACTGTACCGGCTATCGGTAAACTGGCTGTCGAAAACAAG 

ATTGAAGCTTACMCTTCTCGCAGGGCACGTTGGTCCACTGGTTGCGCGCCTTGGCAGGT 

CATAAGCTCGGCGTCTTCACCGACATCGGTCTGGAAACTTTCCTCGATCCCCGTCAGCTC 

GGCGGCAAGCTCAATGACGTAACCAAAGAAGACCTCGTCAAACTGATCGAAGTCGATOGT 

CA^GAACAGCTTTTCTACCCGACCTTCCCGGTCAACGTAGCTTTCCTCCGCGGTACGTAT 

GCTGATGAATCCGGCAATATCACCATGGACGAAGAAATCGGGCCTTTCGAAAGCACTTCC 

GTAGCC^GGCCGTTCACAACTGTGGCGGTAAAGTCGTCGTCCAGCTCAAAGACGTCGTC 

GCTCACGGCAGCCTCGACCCGCGCATGGTCAAGATCCCTGGCATCTATGTCGACTACGTC 

GTGGTAGCAGCTCCGGAAGACCATCAGCAGACGTATGACTGCGAATACGATCCGTCCCTC 

AGCGGTGAACATCGTGCTCCTGAAGGCGCTACCGATGCAGCTCTCCCCATGAGCGCTAAG 

AAAATCATCGGCCGCCGCGGCGCTTTGGAATTGACTGAAAACGCTGTCGTCAACCTeGGC 

GTCGGTGCTCCGGAATACGTTGCTTCT GTTGCCGGTG AAGAAGGT ATCGCCGAT ACCATT 

ACCCTGACCGTCGAAGGTGGCGCCATCGGTGGCGTACCGCAGGGCGGIXK^CCGCTTCGGT 

TCGTCCCGCAATGCCGATGCCATCATCGAGCACACCTATCAGTTCGACTTCTACGATGGC 

GGCGGTCTGGACATCGCTTACCTCGGCCTGGCCCAGTGCGATGGCTCGGGCAACA 

GTCAGCAAGTTCGCTACTAACGTTGCCGGCTGCGGQGGTTTCCCC^ 

acac^tgtSacttctg^^ 

r.M^TCACTTTCAACGGTTCCTATGCAGCCCGCAACGGCAAACACGTTCTCTACATCACA 

^gctogta^gaactgac^ 

A^GA^TTCAAAAAGATATCCTCGCTCACATGC^CT 

ScraTCcWccGccTCT (SEQ 

ID NO:l) 
Figure 7

MRKVEI ITAEQAAQLVKDNDT ITS I GFVS SAHPEALTKALEKRFLDTNTPQNLTYI YAGS 
QGKRDGRAAEHLAHTGLLKRAI IGHWQTVPAIGKLAVENKIEAYN FS QGTLVHWFRALAG 
HKLGVFTOIGLETFLDPRQLGGKLNDVTKEDLVKLIEVTC 

AraSGNITMDEEI<3PreSTSVA<JAVHN03GKVWQVKDWAHGSLDPRMVKIPGIYVDY^ 
WAAPE DHQQT YDCEYDPS LS GEHRAPEGATDAAL PMS AKKI IGRRGALELTEN AWNLG 
VGAPEYVAS VAGEEGIADTITLTVEGGAIGGVPQGGARPGSSRNADAII DHTYQFDFYDG 
GGLDIAYLGLAOCDGSGNINVSKFGTNVAGCGGFPNISQQTPNVYFCGTFTAGGLKIAVE 
DGKVKILQEGKAKKFIKAVDQITFNGSYAAIWGKHVLYITERC^FELTKEGLKLIEVAPG 
I DI EKDI LAHMDFKP II DN PKLMDARLFQDGPMGLKK (SEQ ID NO:2) 
Figure 8

SEQ ID N0:1 1 atgagaaaagtagaaatcattacagctgaacaagcag^-agctcg^ 

sed id NO -3 X — -~ gtgccggtcctgtcggcacaggaagcggtga-attatatt 

SEO ID NO* 4 I atgccgattctctcaaaaatatgggcggctccagcagctggaatcttgag 

SBQ ID NO:5 1 atgaa_ — tgca 

seo ID NO-1 49 aaagacaacgacacgattacgtctatcggctttgtcagcagcgcccatcc 
£ S No':3 40 cccgacgaagcaacactttgtgtgttaggcgctg---gcggcggtattct 
SEQ ID HO:4 51 aaalacUgagaaatgctcatcaaatgaggctaatrtcaatga-catcc 
SEQ ID NO: 5 10 aaaga ; atta atcg _ 



SEO ID NO: 1 99 ggaagcactgaccaaagctttggaaaaacggttcctg 

id No! 3 87 ggaag --ccaccacgtt-aattactgctcttgctgataaatataa 

SEQ ID NO: 4 100 tcgatgaaagcaaaagtcttt a actctgc- - — - 

SEQ ID NO:5 23 [ ~~ " ~~ 

<5eo ID NO-1 136 gacacgaacaccccgcagaacttgacctacatctatgcag-gctctc 

IS ID NO*: 3 129 acagactcaaacaccacgt--aatttatcgattattagtccaa-cagggc 

SEQ ID NO:4 129 cgaagaagccgtgaaggatattccagat-aatgcaaagctttt 

SEQ ID NO:S 23 ctcgccgaatt " 

SEQ ID NO:1 182 agggcaaacgcgatggccgtgccgctgaacatctggcacacacaggcctt

SEQ ID NO:3 176 t?|gcgatcgcgccgaccgtggtattagtcctctggcgcaagaa,gtctg

SEQ ID NO:4 171 a~------^-gttggc- .gg C ttcggac tatgcgg-aatcccagaaaat

SEQ ID NO:5 34 — gcgatgg -

SEQ ID NO:1 232 ttgaaacgcgccatcatcggtcactggcagactgtaccggc-tatcggta

SEQ ID NO:3 226 gtgaaatlgicattatgtggtcactgg-ggacaatcgccgcgtatttctg

SEQ ID NO:4 208 cteatccaagctatca-caaaaactggtcaa " 7? 1?Z

SEQ ID NO:5 41 aattacatgatgga— ga-f ttgtta

SEQ ID NO:1 281 aactggctgtcgaaaacaagattgaagcttacaacttctcgcagggcacg

SEQ ID NO:3 275 aactcgcagaacaaaataaaattattgcttataactacccacaaggtgta

SEQ ID NO:4 HI ttaca&gtatcaaa«-aa^^

SEQ ID NO:5 65 atctcggt — : attg—gtttac— caacacagg

SEQ ID NO:1 331 ttggtccactggttccgcgccttggcaggtcataag^gcgtcttcac

SEQ ID NO:3 325 cttacacaaaccttacgcgccgccgcagcccaccagcxtggtattattag

SEQ ID NO:4 287 ttggcttgctccttc~aaactcgacaaatc~aagaaaatgatctcatc

SEQ ID NO:5 92 ttg?----taattatttacctgataatgtcaat. ttae

SEQ ID NO:1 381 cgacatcggtct ggaaa ctttcctcgatccccgtcagctcggc

SEQ ID NO:3 375 SatatSfcal— c***— ""^^^Sgl

SEQ ID NO:4 333 gtacgtcggtgaaaacggaga atttgctcga caatatcccagc

SEQ ID NO:5 126 --acttclatca gaaaatggctttcttggtttaactgca

SEQ ID NO:1 424 ggcaagctcaatgacgtaacca aagaagacctcgtcaaactgat

SEQ ID NO:3 418 Scaalctgaatgaagtcacta -aagaagacctgattaaactggt

SEQ ID NO:4 376 S^agctW^attca^^

SEQ ID NO:5 163 tttgac cca gaaaatgctaattcaaact

SEQ ID NO:1 468 cgaagtcgatggtca — tgaacagcttttctacccgacc -

SEQ ID NO:3 462 claglttlataacaa— -agaatatctctattacaaagcg -~

SEQ ID NO:4 426 tcrtccagctggtgccggtgttcccgcattctacac-ac caacaggatac
SEQ ID NO:5 191 — tagtaaatgctgg tggtcagcctt
SEQ ID NO:1 505 --ttcccgg— tcaacgtagctttcctccgcggtacgtatgctga tg
SEQ ID NO:3 499 — a t tgcgc— cagatattgccttcattcgcgctaccacctgcga ca
SEQ ID NO:4 475 ggtacccagattcaagaaggaggtgctccga-ttaagtacagtaaaactg
SEQ ID NO:5 215 • — gtggaa ttaa aa

SEQ ID NO:1 548 aatccggcaatatc-accatggacg aagaaatcgggcctttc
SEQ ID NO:3 542 gtgaaggctacgcc-acttttgaag— — — atgaggtgatgtatctc
SEQ ID NO:4 524 aaaaaggaaagattgaagttgcaagtaaagcgaaagaaacacgacaattc
SEQ ID NO:5 227 aaggcggctcta « ; ctttt

SEQ ID NO:1 589 ga aagcacttccgta gcccaggccgttcac— aactgtggcggt
SEQ ID NO:3 583 ga— cgcattggttattgcccaggcggtgcac — aataacggcggt
SEQ ID NO:4 574 aatggaattaattatgtaatggaagaggctatttggggagattttgcatt
SEQ ID NO:5 244 ga tagtgctt t— ttctttcgcttt

SEQ ID NO:1 631 aaagtcgtcgtccaggtcaaagacgtcgtcgc— - tcacggcagoctc
SEQ ID NO:3 625 attgtgatgatgcaggtgcagaaaatggttaa gaaagccacgctg
SEQ ID NO:4 624 gatcaaggcgtggagagcagatac-tcttggaaatattcaattcagacat
SEQ ID NO:5 267 aa : : 1 ! ttc

SEQ ID NO:1 676 gacccgcgca tgg tcaagatccctg- -gcatctatgtcgactac
SEQ ID NO:3 670 catcctaaatctgtccgtattccgg- -ttatctggtggat
SEQ ID NO:4 673 gctgctggaaatttcaataatccaatgtgcaaagcctctaaatgcac — c
SEQ ID NO:5 272 gtggcggtcatgtt— -gatgcctg— — tgtgctaggtggact—

SEQ ID NO:1 718 g tcgtcgtagcagc tccggaagaccatcagcag— acgta tgactgcgaa
SEQ ID NO:3 709 attgtggtggtcgatccg gatcaaacccaa— ctgtatggcggtgca
SEQ ID NO:4 721 atcgtcgaagtag aggaaatcgtcgaaccgggagtaattgctccaaa
SEQ ID NO:5 309 ; :

SEQ ID NO:1 766 t- acgatccgtccctcagcggtgaacatcgtgctcctg-aaggc
SEQ ID NO:3 754 c cggttaaccgctttatttctggtgacttcacccttg-atgac
SEQ ID NO:4 768 cgatgtgcacattccatcaatctattgtcatcgtctagttttgggaaaga
SEQ ID NO:5 309 7 tg-aagtt

SEQ ID NO:1 808 gctac cgatgcagc — tctccccatgagcgctaaga
SEQ ID NO:3 796 agtac caaacttag cctgcccctaaac-caacgt
SEQ ID NO:4 818 actacaaaaaaccaatcgaacggccaatgttcgcacacgaaggaccaata
SEQ ID NO:5 316 gatca agaagcaaa tctcgc

SEQ ID NO:1 842 aaatcatcggc-cgccgcggcgctttggaattgactgaaaacgctgtcgt
SEQ ID NO:3 829 aaattagttgcgcggcgcgcattattcgaaatgcgtaaaggcgcggtggg
SEQ ID NO:4 868 aaaccatctac-atcggc— tgctggaaaatcgagagaaatcattg-cag
SEQ ID NO:5 335 — — — - — ' »-■— -- taactgga- — — —

SEQ ID NO:1 891 caacctcggcgtcggtgctcc ggaat—acgttgcttctgttgcc
SEQ ID NO:3 879 gaatgtcggcgtcggta ttgc tgacg — gcattggcctggtcgcc
SEQ ID NO:4 914 cacgtgcagctttggagttcacagatggaatgtacgccaatttgggtatc
SEQ ID NO:5 344 — — — tggtgcc

SEQ ID NO:1 934 gg— tgaagaaggtatcgccga — tacca-^— —ttaccct^ac
SEQ ID NO:3 922 eg— agaagaaggttgtgctga tgact ttattctgac
SEQ ID NO:4 964 gggattccgactttggcgccaaattatataccaaatgg attta ctgttca
SEQ ID NO:5 351 tg— gcaaaatggta 5

SEQ ID NO:1 969 cgtcgaaggtg -gcgccatcggtggcgt-accgcagggcggtgcc
SEQ ID NO:3 957 ggtagaaacag gtccgattggcggaattacttcacaggggatcg
SEQ ID NO:4 1014 tttgcaaagtgagaatggtattattggagtggg-accata tcca
SEQ ID NO:5 364
SEQ ID NO:1 1012 cgcttcggttcgtcccgca-atgccgatgccatca tcgaccacacc
SEQ ID NO:3 1001 c-ctttggcgcgaacgtga-atacccgtgccattc* tggatatgacg
SEQ ID NO:4 1057 agaaaag gaacagaagacgccgatctcattaatgctggaaaagagc
SEQ ID NO:5 364 -ccagga-atg

SEQ ID NO:1 1057 tatcagttcgacttctacgatggcggc ggtctggacatcg
SEQ ID NO:3 1045 tcccagtttgatttttatcacggtggc ggtctggatgttt
SEQ ID NO:4 1103 — caattactcttct-caaaggagcttcaattgttggttctgatgaatc
SEQ ID NO:5 373 ggcgga -gcaatggacttag

SEQ ID NO:1 1097 cttacctcggcctgg -cccagtgcgatg gctcgggcaac
SEQ ID NO:3 1085 gttatttgagttttg ctgaagtcgacc agcacggtaac
SEQ ID NO:4 1149 attcgcaetgattcgtggttctcatatggatattactgtgctcggtgcac
SEQ ID NO:5 392 tg — actggtgcaa-

SEQ ID NO:1 1135 atcaacgtcagca-agttcggtactaacgttgccggctgcggcggtttcc
SEQ ID NO:3 1123 gtcggcgtgcata-aattcaatggtaaaatcatgggcaccggtggattta
SEQ ID NO:4 1199 ttca — gtgctcacagtttgg agatttagcgaattggatgattccg
SEQ ID NO:5 404 ;

SEQ ID NO:1 1184 ccaacatt — tcccagcagacaccgaatgtttacttctgcggcacct-tc
SEQ ID NO:3 1172 ttgatatcagtgccacttcgaagaaaatcatt — ttctgcggcacat-ta
SEQ ID NO:4 1243 ggaaaatt ggtga-aaggaatgggcggtgcaatggatcttgtc
SEQ ID NO:5 404 aaaaagtgattatt ggca

SEQ ID NO:1 1231 acggctggcggcttgaaaatcgctgtcgaagacggcaaagtcaagatcct
SEQ ID NO:3 1219 actgcgggcagtttaaaaacagaaattaccgacggcaaattaaatatcgt
SEQ ID NO:4 1285 tctgctcccgg agcccgtgt-gatcgtfcgtaatggagcatgtat
SEQ ID NO:5 422 ~ -tggaacattg tgccaagtcaggttcct

SEQ ID NO:1 1281 ccaggaaggcaaagccaagaagttcatcaaagctgtcgaccagatcactt
SEQ ID NO:3 1269 ccaggaaggacgggtgaagaaatttattcgggaactaccggaaattactt
SEQ ID NO:4 1328 cgaagaacggagagccaaaaatt ctagagcactg
SEQ ID NO:5 449 caaaaattctaaag aaatgtacattaccgct cacagcaagt

SEQ ID NO:1 1331 tcaacgg — ttcctatgcagc ccgcaacggcaaacacgttctct
SEQ ID NO:3 1319 tcagcggaaaaatcgctctcgagc gagggctgg atgttcgtt
SEQ ID NO:4 1362 cgaacr ttcctctga — c cggcaaagg — agtaatttoccg
SEQ ID NO:5 490 aaaaaag 1 tgcca tggt ggt taccgaa t tggca gtattta

SEQ ID NO:1 1373 a — catcacagaacgctgcgtatttgaactgacca — aagaa-ggcttga
SEQ ID NO:3 1361 a— tatcactgagcgcgcagtattcacgctgaaag— aagac-ggcctgc
SEQ ID NO:4 1398 aatcattactgatatggcagttttcgacgtggacacaaagaacggattga
SEQ ID NO:5 530 a— cttcattgaaggcagattagttcta — a— aagaa catgc

SEQ ID NO:1 1418 aactcatcgaagtcgcaccgggcatcgatattgaaaaagatatcctcgct
SEQ ID NO:3 1406 atttaatcgaaatcgcccctggcgtcgatttacaaaaagatattctcgac
SEQ ID NO:4 1448 cattgatcgaagt — caggaaggatc-ttactgtagatgatat— ~
SEQ ID NO:5 567 tcctcat gtggatttagaaaca attaaagcc

SEQ ID NO:1 1468 cacatggacttcaagccgat — cattgata atccga — aactcatgg
SEQ ID NO:3 1456 aaaatggatttcaccccagt — gatttcgccagaactca— aactgatgg
SEQ ID NO:4 1488 — caagaaactca — ccg cttgcaa attcga — aatttccga
SEQ ID NO:5 598 aaaacag— aagccgatttcattgtt- gccgatgatttcaaag

SEQ ID NO:1 1511 atgcccgcctcttccaggacggtcccatggga ctgaaaaaa
SEQ ID NO:3 1502 acgaaagattatttatcgatgcggcgatgggttttgtcctgcctgaagcg
SEQ ID NO:4 1524 aaatctgaagccaatgggacaggctcctctta atcaaggataa-
SEQ ID NO:5 638 aaatgcaaatcagccag aaagga-- — -> cttgaattatga

SEQ ID NO:1 1552 taa

SEQ ID NO: 3 1552 gctcattaa 

SEQ ID NO: 4 1567 

SEQ ID NO: 5 673 
Figure 9
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SEQ ID NO: 7 


368 


SEQ ID NO: 8 


82 


SEQ ID NO:2 


388 


SEQ ID N0:6 


384 


SEQ ID NO: 7 


416 


SEQ ID NO: 8 


121 


SEQ ID N0:2 


432 


SEQ ID NO: 6 


428 


SEQ ID NO: 7 


456 


SEQ ID NO: 8 


155 



mrkvetit — 

-- npvla— 



aeqaaqlv 

aqeavnyi 



-mnakeli- 



-arriamel 



kdndtitsigfvasahpealt— kalekrfldtntpqnltyiyagsqgkr 
pdeatlcvlg-agggileattlitaladkykqtqtprnlaiisptglgdx 
pdnakllvggfglcgipenli— qai tktgqkgltcvsnnagv- 



adrgisplaqeglvkvalcghwgqsprlselaeqnkiiaynypqgvltqt 



lraaaahqpgiiadlgigtfvdprqqgglOnevtkedliklvef- 



-igl- 



-ylpdnvnitlqsengflglta- 



-ptqwn- 



-Icc tiveveeivepgviapndvhipsiychrlvlg— knykk 



-enansnl-vn — 



-apeyva^ag^gladtJ.tltveggalg — gvpqggarfgssraad- 



-gftvhlqsengiigvgpyprkgtedadliiiagke 
— — ggqpc — gikkggstf- 



-alldhtyqfdfydgggldiaylglaqcdgsgnl-nvskfgtn 
-alldmtsqfdf>rfaq qqldvcyl8faevdqhgny^gyfakfnqk 



-dsafsfalirgghvdacvlgglevdqeanlanvravpgkm 



-kctlplt- 



- — edgkvkilqegk 
— tdgklnivqegr 
- vsapgarvi wmehvs kngepkilehce 
-kegsskllk 



-lpltgkgviariitdmavfdvdtkngltlievr 
— askkvam--wtelavfn£-iegrlvlkeha 
SEQ ID NO:2 479 pgidiekdi--lahmdfkpiidnp-klmdarlfqdgpmglkk

SEQ ID NO: 6 475 pgvdlqkdi— ldkmdftpvispelklmderlfidaamgfvlpeaah 

SEQ ID NO:7 489 kdltvd^ikkltackfe-lsenl-kpmgqaplnqg

SEQ ID NO: 8 190 phvdle-ti~ kakteadfivad- dfkemqisqkglel 
Figure 10



GTGAAAACTGTGTATACTCTCGGAATCGACGTTGGTTCTTCTTCTTCCAAGGCAGTCATC
CTGGAAGATGGCAAGAAGATCGTCGCCCATGCCGTCGTTGAAATCGGCACCGGTTCGACC
GGTCCGGAACGCGTCCTGGACGAAGTCTTCAAAGATACCAACTTAAAAATTGAAGACATG
GCGAACATCATCGCCACAGGCTATGGCCGTTTCAATGTCGACTGCGCCAAAGGCGAAGTC
AGCGAAATCACGTGCCATGCCAAAGGGGCCCTCTTTGAATGCCCCGGTACGACGACCATC 
CTCGATATCGGCGGTCAGGACGTCAAGTCCATCAAATTGAATGGCCAGGGCCTGGTCATG 
CAGTTTGCCATGAACGACAAATGCGCCGCTGGTACGGGCCGTTTCCTCGACGTCATGTCG 
AAGGTACTGGAAATCCCCATGTCTGAAATGGGGGACTGGTACTTCAAATCGAAGCATCCC 
GCTGCCGTCAGCAGTACCTGCACGGTTTTTGCTGAATCGGAAGTCATTTCCCTTCTTTCC
AAGAATGTCCCGAAAGAAGATATCGTAGCCGGTGTCCATCAGTCCATCGCCGCCAAAGCC
TGCGCTCTCGTGCGCCGCGTCGGTGTCGGTGAAGACCTGACCATGACCGGCGGTGGCTCC
CGCGATCCCGGCGTCGTCGATGCCGTATCGAAAGAATTAGGTATTCCTGTCAGAGTGGCT
CTGCATCCCCAAGCGGTGGGTGCTCTCGGAGCTGCTTTGATTGCTTATGATAAAATCAAG
AAATAA (SEQ ID NO:9)
Figure 11

VKTVYTLGIDVGSSSSKAVILEDGKKIVAHAVVEIGTGSTGPERVLDEVFKDTNLKIEDM
ANIIATGYGRFNVDCAKGEVSEITCHAKGALFECPGTTTILDIGGQDVKSIKLNGQGLVM
QFAMNDKCAAGTGRFLDVMSKVLEIPMSEMGDWYFKSKHPAAVSSTCTVFAESEVISLLS
KNVPKEDIVAGVHQSIAAKACALVRRVGVGEDLTMTGGGSRDPGVVDAVSKELGIPVRVA
LHPQAVGALGAALIAYDKIKK (SEQ ID NO: 10) 
Figure 12

SEQ ID NO:9 1 gtgaaaactgtgtatactctcggaatcgacgttggttcttcttcttccaa

SEQ ID NO:11 1 - — atgagtatctataccttgggaatcgatgttggatctactgcatfccaa

SEQ ID NO: 12 1 gtggcagtggcatattcgattggcattgattccggctcaaccgccaccaa 

SEQ ID NO: 13 1 — — atgattttagggatagatgttggatctacaacaacgaa 

SEQ ID NO: 9 SI ggcagtcatcctggaagatggcaagaagatcgtcgc-ccatgccgtcgtt 

SEQ ID NO: 11 48 gtgcattatcctgaaagatggaaaagaaatcgtggc-gaaatccctggta 

SEQ ID NO: 12 51 agggatcttactggcagacggcgtgatta cgcgccgtttcctcgtt 

SEQ ID NO: 13 39 gatggttctaatggaagatagc aagataatttg-gtataagatagag 



SEQ ID NO:9 100 gaaatcggcaccggttcgaccggtccggaacgcgtcctggacgaagtctt
SEQ ID NO:11 97 gccgtggggaccggaacttccggtcccgcacggtctatttcggaagtcct
SEQ ID NO:12 97 ccaa -ccccctttcgcccgg-caacagcaattact gaagcctg
SEQ ID NO:13 85 gatattgg-agttgtta ttgaggaagatattttattaaaaatggt

SEQ ID NO:9 150 caaagatacc-aacttaaaaattgaagacatggcgaacatcatcgc-cac
SEQ ID NO:11 147 ggaaaatgcc-cecatgaaaaaagaagacatggcctttaccctggc-tac
SEQ ID NO:12 138 ggaa-actct-gcgcgaagggttagagacaacgccgtttctgacgctcac
SEQ ID NO:13 129 taaggagattgaacaaaaatatccaatagat — — aaaatcgttgc-aac

SEQ ID NO:9 198 aggctatggccgtttcaatgtcg- -actgcgccaaaggcgaag
SEQ ID NO:11 195 cggctacggacg caat-tcgctggaaggcattgccgacaagcaga—
SEQ ID NO:12 186 cggctacgggcggcaactggtgg™ —a ttttgccga taaacagg
SEQ ID NO:13 174 tggatatggaaggcataaggtta— — gttttgcagataagatag

SEQ ID NO:9 239 tcagcgaaatcacgtgccatgccaaaggggcc ctctttgaatgcccc
SEQ ID NO:11 239 t gagcgaactg agct gcca tgcca t gggcgcc agct t t a tc tggccc
SEQ ID NO:12 227 taacggaaatctcctgtcacgggctgggcgca cggtttcttgcgcca
SEQ ID NO:13 215 ttccagaagtta-ttgcattgggaaaaggagctaactatttctttaacga

SEQ ID NO:9 286 ggt acgacga— cca tcctcga tat cggcggtcaggacgt caa-gtcca t
SEQ ID NO:11 286 — aacgtccataccgtcatcgatatcggcgggcaggatgtgaa-ggtcat
SEQ ID NO:12 274 gcaacgcgcg — cggtaa t cgacat cggtggtcaggacagcaaagtga tt
SEQ ID NO:13 264 ggcaga tgga gt ta tagaca ttggagggcaagat acaaa-ggtctt

SEQ ID NO:9 333 caaattga — atggccagggcctggtcatgcagtttgcc-atgaacgaca
SEQ ID NO:11 333 ccatgtgg — aaaacgggaccatgacca atttccag-atgaatgata
SEQ ID NO:12 322 cagcttgatgatgacggtaacctg- tgcgatttcctgatgaatgaca
SEQ ID NO:13 309 aaagattg — a taaaaacggaaaag ttgttga ttttatc-ctat cagata .

SEQ ID NO:9 380 aatgcgccgctggtacgggccgtttcctcgacgtcatgtcgaaggtactg
SEQ ID NO:11 377 aatgcgctgccgggactggccgtttcctggatgttatggccaatatcctg
SEQ ID NO:12 368 aatgcgcggcgggcaccgggcgtttcctggaggtgatc tcgcgcacgctt
SEQ ID NO:13 356 aa tg tgccgc tggaactggaaaat tcttaga— — — aaaggcatta

SEQ ID NO:9 430 gaaa tcccca tgtct-ga — aa tgggggactggtactt-caaatcgaagc
SEQ ID NO:11 427 gaagtgaaggtttcc-ga— cctggctgagctgggagc-caaatccacca
SEQ ID NO:12 418 ggca — ccagcgtcgagc — aactcgacagcatt accg-aaaa t gtc
SEQ ID NO:13 397 gatattttaaaaatt-gataaaaatgagataaataaatacaaatcagata

SEQ ID NO:9 476 atcccgct-gccgtcagcagtacctgcacggtttttgctgaatcggaagt
SEQ ID NO:11 473 a acgggtg -gc t a tcagctccacc tg tact gtgtttgcagaaag tgaagt
SEQ ID NO:12 460 acgccgcacgccatcacgagtatgtgcacagtgtttgctgaatcagaagc
SEQ ID NO:13 446 a ta tcgct-aaaa ta tct t caa tgtgtgctgtctttgc t gaaag tgaga t
SEQ ID NO:9 525 catttcccttctttccaagaatgtcccgaaagaa — gatatcgtagccgg
SEQ ID NO:11 522 catcagccagctgtccaa— aggaaccgacaagatcgacatcattgccgg
SEQ ID NO:12 510 gatcagcctgcgctcagcgggcgtcgcgccagaa — gcga 1 1 ct cgcagg
SEQ ID NO:13 495 aataagcttactatcaaaaaaagttccaaaggaa — ggca t tttaatggg

SEQ ID NO:9 573 tgtccatcagtccatcgccgccaaagcctgcgctctcgtgc-gccgcgtc
SEQ ID NO:11 570 gatcca tcgttctgtagccagccgggtcat tggt cttgcca-atcgggtg
SEQ ID NO:12 556 agtgattaacgcgat-ggcgcggaggagtgc-caatttcat-tgctcgtc
SEQ ID NO:13 543 cgtctatgagagtat aa t aaa tagggtta tcccaa tgaccaata

SEQ ID NO:9 622 ggtgtcgg — tgaagacctgaccatgaccggcggtggctcccgcgat— c
SEQ ID NO:11 619 gggattgt — gaaagacgtggtcatgaccggcggtgtagcccagaac — t
SEQ ID NO:12 605 tctc-ctg— tgaagcgccgattctgtttactggtggcgttagtcattgc
SEQ ID NO:13 587 ggcttaaaattcaaaacatagtgttttagtggaggagttgctaaaaat— a

SEQ ID NO:9 668 ccggcgtcgtcgatgccgtatcgaaagaat- -taggtattcctgtc
SEQ ID NO:11 665 atggcgtgagaggagccct— — — ggaag — aaggccttggcgtg
SEQ ID NO:12 652 cagaagt- -ttgcccggatgctggaatctcacctgcgaatgccggta
SEQ ID NO:13 635 aggttttggttgagatgtttgagaaaaaat tgaataaaaaacta

SEQ ID NO:9 712 agagtcgctctgcatccccaagcggtg — ggtgctctcggagctgc
SEQ ID NO:11 703 gaaatcaagacgtctcccctggctcagtacaacggtgccctgggtgccgc
SEQ ID NO:12 697 aatacccatcctgatgcgcaatttgct ggcgcaattggcgcggc
SEQ ID NO:13 679 ctaattccaaaagaaccacagattgtt--—— tgctgtgttggagctat

SEQ ID NO:9 756 tttgattgctta -tgataaaatcaagaaa-taa
SEQ ID NO:11 753 tctgtatgcgta — t-aaaaaagcagccaaataa
SEQ ID NO:12 741 ggtaattggtcaa cgagtgaggacacgccgat ga —
SEQ ID NO:13 723 attggtt taa
Figure 13

SEQ ID NO:10 1 vktvytlgidvgsssskaviledgkkivahavveigtgstgpervldevf

SEQ ID NO:14 1 ms-iytlgidvgstaskciilkdgkeivakslvavgtgtsgparsisevl

SEQ ID NO:15 1 mavaysigidsgstatkgilladg-vltrrflvpt pfrpataiteaw

SEQ ID NO:16 1 ndlgidvgstttkmylmeds-kiiwykiedigv--«vieedillkniv

SEQ ID NO:10 51 kdtnlkiedmaniiatgygrfnvd-cakgevseitchakgalfecpgttt

SEQ ID NO:14 50 enahmkkedmaftlatgygrnBlegiadk^nselschamgasfiwpnvht

SEQ ID NO:15 47 etlreglettpfltltgygrqlvd-fadkqvteischglgarflapatra

SEQ ID NO:16 44 keieqkyp-idkivatgygrhkve-fadkivpevialgkganyffneadg

SEQ ID NO:10 100 ildiggqdvksiklngqglvmqfamndkcaagtgrfldvmskvleipmse

SEQ ID NO:14 100 vidiggqdvkvihve-ngtmtnfqmndkcaagtgrfldvmanilevkvsd

SEQ ID NO:15 96 vidiggqdskviqldddgnlcdflnndkcaagtgrf levisrtlgteveq

SEQ ID NO:16 92 vidiggqdtkvlkidkngkwdfilsdkcaagtgkflekaldilkidkne

SEQ ID NO:10 150 mgdwyfkskhpaavsstctvfaesevisllsknvpkedivagvhqsiaak

SEQ ID NO:14 149 laelgakstkrvaisstctvfa'esevisqlskgtdkidliagiljravasr

SEQ ID NO:15 146 l-daitenvtphaltsnctvfaesealslcsagvapeailagvinaniarr

SEQ ID NO:16 142 ink--yksdniakissmcavf aeseiisllskkvpkegilmgvyeatixur

SEQ ID NO:10 200 acalvrrvgvgedltmtgggsrdpgvvdavskelgipvrvalhpqavgal

SEQ ID NO:14 199 viglanrvglvkdvTnntggvaqnygvrgaleeglgvelktsplaqyngal

SEQ ID NO:15 195 sanf iarlsceapilftggv^hcqkfarmleshlrmpvnthpdaqf agal

SEQ ID NO:16 190 vipmtnrlki-qpiivfsggvaknkvlvemfekklnkkllipkepqlvccv

SEQ ID NO:10 250 gaaliaydkikk—

SEQ ID NO:14 249 gaalyaykkaak—

SEQ ID NO:15 245 gaavig-qrvrtrr

SEQ ID NO:16 239 gailv •
Figure 14

AT€A€TGAAGAAAAAACAGTAGATATTt3AAAGCATGA€CTCCAAGGAAGCCCTTGGTTAC 
TTCTTGCCGAAAGTCGATGAAGACGCACGTAAAGCGAAAAAAGAAGGCCGCCTCGTTTGC 
TGGTCXGCTTCTGTCGCTCCTCCGGAATTCTGCACGGCTATGGACATCGCCATCGTCTAT 
CCGGAAACTCACGCAGCTGGTATCGGTGCCCGTCACGGTGCTCCGGCCATGCTCGAAGTT 
GCTGAAAACAAAGGTTACAACCAGGACATCTGTTCCTACTGGCGCGTCAACATGGGCTAC 
ATGGAACTCCTCAAACAGCAGGCTCTGACAGGCGAAACGCCX3GAAGTCCTCAAAAACTCC 
CCGGCTTCTCCGATTCGCCTTCCGGATGTTGTCCTCACTTGCAACAACATCTGCAATACC 

ttgctcaaatggtatgAaaacttggct^ 

gtaccgttcaaccatgaattccctgttacgaaacacgc^aaacagtacatcgtcggcgaa 
ttcaaacatgctatcaaacagctcgaagacctttgcggccgtcccttcgactatgacaaa 
ttcttcgaagtacagaaacagacacagcgctccatcgctgoctggaacaaaatcgctacg 

TACTTCCAGTACAAACCGTCGCCGCTCAACGGCTTCGACCTCTTCAACTACATGGGCCTC 
GCCGTTGCTGCCCGCTCCTTGAACTACTCGGAAATCACGTTCAACAAATTCCTCAAAGAA 
TTGGACGAAAAAGTAGCTAATAAGAAATGGGCTTTCGGTGAAAACGAAAAATCCCGTGTT 
ACTTGGGAAGGTATCGCTGTCTGGATCGCTCTCGGCCACACCTTCAAAGAACTCAAAGGT 
CAGGGCGCTCTCATGACTGGTTCCGCTTATCCTGGCATGTGGGACGTTTCCTACGAACCG 
GGCGACCTCGAATCCATGGCAGAAGCTTATTCCCGTACATACATCAACTGCTGCCTCGAA 
CAGCGCGGTGCTGTTCTTGAAAAAGTTGTCCGCGATGGCAAATGCGACGGCTTGATCATG 
CACCAGAACCGTTCCTGCAAGAACATGAGCCTCCTCAACAACGAAGGCGGCCAGCGCATC 
CAGAAGAACCTCGGCGTACCGTACGTCATCTTCGACGGCGACCAGACCGATGCTCGTAAC 
TTCTCGGAAGCACAGTTCGATACCCGCGTAGAAGCTTOGGCAGAAATGATGGCAGACAM 
AAAGCCAATGAAGGAGGAAACCACTAA (SEQ ID NO: 17) 
Figure 15

^SEEKTVDIESMSSKEALGYFLPKVDEDARKAKKEGRLVCWSAS 

PETHAAGKARHGAP^LEVAENKGYNQDICSYCRVNMGYMELLKQQALTGETPEVLKNS 
PAS PI PLPDWLTCNNICNTLLKWYENLAKELNVPLINI DVPFNHEFPVTKHAKQYI VGE 
FKHAIKQLEDLCGRPFDYDKFraVQKQTQRSIAAWNKIATYFQYKPSPLKGFDLFNYMGL 
AVAARSLNYSEITraKEXKELDEKVANKKWAFGENEKSRVTWEGIAVWIALGHTFKELKG 
QGAL^m;SAYPGMWDVSYEPGDLESMAEAYSRTYIN^^ 

HQNRSCKNMSLLNNEGGQRIQKNLGVPYVIFIXSDQTDARNFSEAQFDTRVEALAEMMADK 
KANEGGNH (SEQ ID NO: 18) 
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SEQ 


ID NO: 17 


SEQ 


ID NO: 19 
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ID NO:20 


SEQ ID NO: 21 
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ID NO: 17 
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ID NO: 19 


SEQ ID NO? 20 
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ID NO:21 


. s&Q 


ID NO: 17 


SEQ,. ID NO: 19 


SEQ 


ID NO:20 
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ID NO: 21 


SEQ ID NO: 17 
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ID NO: 19 
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ID NO: 20 
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ID NO:21 
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ID NO: 17 
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ID NO: 19 
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ID NO: 20 
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ID NO: 21 
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ID NO: 19 
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ID NO:20 
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ID NO:21 
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ID NO: 17 
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ID NO: 19 
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ID NO: 20 
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ID NO:21 


SEQ 


ID NO: 17 


SEQ 


ID NO: 19 


SEQ 


ID NO: 20 


SEQ 


ID NO: 21 


SEQ 


ID NO: 17 


SEQ ID NO: 19 


SEQ 


ID NO: 20 


SEQ 


ID NO: 21 


SEQ 


ID NO:17 


SEQ 


ID NO: 19 


SEQ 


ID NO: 20 


SEQ 


ID NO:21 



Figure 16 

1 atgagtgaagaaaaaacagtagatattgaaagcatgagctccaaggaagc 

1 atg ccaaagacagta agccctggcgttcagg— — 

X — — atgatgaaattaaag — gcaattgaaaagttga — tgcaa — 

X — atgtcacttgtcaccga- tcta — cccgc 

51 cctt ggttacttcttgccgaaa— gtcgatgaagacgca c 

32 -cat- — tgagagatgtagttgaaaaggtttacagagaactg— — -~c 

37 aaatt cgcca — gtagaaaagaacagc 1 

27 cattttcgatcagttct — ctgaag — ctcgccagacaggctttctcacc 

89 gta-aagegaaaa-aagaaggccgcctcgttt-gctggtccgcttctgtc 
71 ggg-aaccgaaag-aaagaggagaaaaagtag-gctggtcctcttc— ca 

63 atataagcaaaaagaagaaggtagaaaagttt ttggaatgttctgtg 

73 gtc-atggatctc-aaggag — cgcggcattccgctggt tggc 

136 gctcctccggaattctgcacggctatggacatcgccatcgtc— tatccg 
11 6 agttcccctgcgaactggctgaatctt ttcggctgca tgtt gg^ta tccg 

110 cct atgttcca -atagaaat aat — tt — tagcag 

112 act tactgcacctttatg ccgcaagag— — -atccc 

18 4 gaaactca— cgcagctggtatcggtgcc— cgtcacggtg- 

166 gaaaacca — ggctgctggtatcgctgccaaccgtgacggcgaagtgatg 
140 caaatgcaatcccagttggtttgtgtgga— ggtaaaaat- 
144 ga 1— ggcagc cggtgcg gtt- 



221 



■ — - — ■ - , ■ ctccggccatgc 

2r4n^cca^~c^ca~«^ 

!70 -gacacaa 

— gtttcgctctgt 



233 tcgaagt-t- 



— — — gctg — — - - - — — aaaa — 

264 ccgtatt-tccctggcttatgctgccgggttccggggtgccaacaaaatg 
185 tcccaat-a- 
178 tccacctct— 



249 — caaaggttacaaccaggacatctgttcctactgccgcgtcaacatg — 
313 gacaaagatggcaactatgtcatcaacccccacagcggcaaacagatgaa 

198 ggaggat-ttgccaagaaacctatgcc- cattaata 

195 — ca ttgaagaagcggagaaagat ctgccgcg-caacct 



295 



ggctacatggaactc— ctcaaacagcag- 



363 agatgccaatggcaaaaaggtattcgacgcagatggcaaacccgtaatcg 

232 aaatc— atccta— — tg - 1 — ■■ 

23i — - "Ctgcccg— -ctg — attaaa-agca- ~ 



322 
413 
245 
251 



atcccaagaccctgaaaccctttgccaccaccgacaacatctatgaaatc 



322 - — gctctgac aggcgaaa — -cgccggaa-gtcctcaa 

4 63 gctgctctgccggaaggggaagaaaagacccgccgccagaatgccctgca 

245 gttttaa ; — gaa-ggca— aa 

251 gctacggc ttcggcaa aaccg at 
SEQ ID NO:17 354 aaactccccggcttctccgattccccttccggatgttgtcctcacttgca
SEQ ID NO:19 513 caaatatcgtcagatgaccatgcccatgccggacttcgtgctgtgctgca
SEQ ID NO:20 261 aacctgccc— ttactttgaa.gcatct gatatagttat-tggagaa
SEQ ID NO:21 274 aaatgcccctacttct acttttcggatctggtggtc ggtg

SEQ ID NO:17 404 acaacatctgca ataccttgctcaaatggtatgaaaacttgg-
SEQ ID NO:19 563 acaacatctgca actgcatgaccaaatggtatgaagacattg-
SEQ ID NO:20 304 actacctgtgaaggaaagaagaagatgtttgagttgatggagagattggt
SEQ ID NO:21 314 aaaccacctgcg acggcaaaaagaaaatgtatgaatacatgg-

SEQ ID NO:17 446 -ctaaagaattgaac gtacctctca tcaaca t cgacgtac — c
SEQ ID NO:19 605 -cccgtcggcacaac attcctttga tcatgatcgacgttc — c
SEQ ID NO:20 354 gccaatgcatataat gcacctcccacacatgaaagatgaagatt — c
SEQ ID NO:21 356 -c ggagtttaagcctgttcatgtga tgca-attgcccaacagc

SEQ ID NO:17 486 gttca — accatgaattc cctg — tta-cgaa ac — acgctaa
SEQ ID NO:19 645 ttaca--ac gaattcgaccatg — tcaacgaa — — gccaacgtgaa
SEQ ID NO:20 399 tttga — a aatct ggat— taa-agaagttgaa — aagctaa
SEQ ID NO:21 397 gttaaggacgatgcctcg cgtgcgtta-tgga-

SEQ ID NO:17 522 acagtacatcgtcg gcgaat tcaaacatgctatca-
SEQ ID NO:19 684 a tacatccggt- cccagctggatacggccatcc gtcaaa
SEQ ID NO:20 434 — aagaattggttgagaaagagactggaeataaaataacagaggaaaagt
SEQ ID NO:21 429 — agccgagatgct gcgcttgcaa — a aaacgg

SEQ ID NO:17 563 tcgaagacctttgcggccgtcccttcgactatgacaaattcttcgaagta
SEQ ID NO:19 722 tggaagaaatcaccggcaagaagttcgatgaagacaaattc gaa
SEQ ID NO:20 482 taaaaga gacagttgat— aaagta
SEQ ID NO:21 458 tagaagaacgttttgggcacgagattagcgaagatgctctgcgcgatgcc

SEQ ID NO:17 613 cagaaacagacacagcgctc-catcg— ctgcc- tggaacaaaat
SEQ ID NO:19 766 cag-tgctgccagaacgc-c-aaccgtactgccaaagcatggctgaaggt
SEQ ID NO:20 505 aataaagttagggag — t tgttttataaa
SEQ ID NO:21 508 attgcgctgaaaaaccgcgaacgtcg — cgcac tgg ctaat

SEQ ID NO:17 654 cgctacgtact tc— c— agt aca aaccg tcgccgc tcaacggct tcgac
SEQ ID NO:19 813 ttgcgactacctg— c— agtacaaaccggctccgttcaacgggt tcgac
SEQ ID NO:20 532 ctctatgaattga— ggaagaataaaccagctccaattaagggtttagat
SEQ ID NO:21 547 ttttatcatcttgggc—agttaaatcctccggcgcttagcggcagcgac

SEQ ID NO:17 700 ctcttcaactacatgggcctcgccg-ttgctgcccgctccttgaactact
SEQ ID NO:19 859 ctgttcaaccatatggctgacgtgg-ttaccgcccgtggccgtgtggaag
SEQ ID NO:20 580 gttttaaaattattccagtttgcctatttattggatattgatgacacaat
SEQ ID NO:21 595 attctga aagtggtttacggcg-caaccttccggttcgataaagagg

SEQ ID NO:17 749 cggaaa tcacgttcaacaaa t tcctca aagaattggacgaaaaagtagc-
SEQ ID NO:19 908 ctgctgaagctttcgaactgctggccaaggaactggaacagcatgt-
SEQ ID NO:20 630 agggatt ttagaggatttaattgaggagttagaggagagagtt — -
SEQ ID NO:21 641 eg ttgatcaa tgaactggatgcaatgaccgcc

SEQ ID NO:17 798 — t aa taagaaatgggct t tcggtgaa aacgaaaaa tcccg
SEQ ID NO:19 954 gaaggaaggcaccaccaccgctcccttcaaagaacagcatcg
SEQ ID NO:20 673 aaaaaaggagaaggttatgaaggaa agagaa— ;
SEQ ID NO:21 673 cgcgttcgtcagcagtgggaagaag— gec agcgactggacccg

SEQ ID NO:17 837 tgttacttgggaaggta-tcgctgtctggatcgctctcggccacacc
SEQ ID NO:19 996 tatcatgttcgaaggga-tcccctgctgg — ccgaaactgccgaacc- —
SEQ ID NO:20 704 1 1 1 1 aa taac- 1 ggct g tc-caa tgg 1 1 gc t ggaaacaa taag
SEQ ID NO:21 715 cgt — ccgcgcattttaatcaccggctg cccgattggcggcgc
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869 


SEQ 


ID NO: 21 


674 


SEQ 


ID NO: 17 


1033 
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1186 
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913 


SEQ 


ID NO: 21 




SEQ 


ID NO: 17 


1082 


SEQ 


ID NO: 19 


1235 


SEQ 


ID N0:20 


961 



1— tcaaagaactca — aaggtcagggcgctctcatgactggttcc 

tgttcaaaccgctga — aagccaacggcctgaacatcaccggcgtt 

attgt— tgaaattattgaggaagtt ggaggagtagttgttggtgaa 

— agcaga — aaaagtggtgcgcgcgattgaagagaatg 

gcttat™cctggcatgtgggacgtttcctacgaacc— — ~ *ggg- 
gtatatgctcctgctttcgggttcgtgtacaacaacct— — — gga- 
g — aaa — gctgcactggaacaagattctttgaaaactttgttgaggg- 
gc g — gctgggttgtcggttatgaaaactgcacc — — — -gggg 

-cga cctcg-aatccatggcagaa — — gcttattcccgtac 

cga attgg tcaaagcctact' gcaaagccccgaac 

ctatagcgtag-aggacattgcaaaa agata cttta 

827 cgaaagcga ccgagcaatgcgtggcagaaacgggcgatgtctacgac 



atcaactgctgcct- 



-cgaacagcgcggtgct 
— cgaacagggtgttgcc 
acgatgagagagttgaa 
ga t tggctgctcct 
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1366 


SEQ 


ID 


NO:20 
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1098 


SEQ 


ID 


NO: 17 


1250 


SEQ 


ID 


NO: 19 


1403 


SEQ 


ID 


NO: 20 


1118 


SEQ 


ID 


N0:21 


1146 



-tacac-tttgcagta ttgccat acatttaacatagagggagc 

959 aggaatatcaggtcgatggcgtagttga — tgtgattttgcaggcgt 



gccatacctacgcggtggaatcgc— tggcgattaaacgtcatgtgc 

gacggcgaccagaccgatgctcgtaacttctcggaagca- 
gacggtgaccaggctgacccgagaaacttcaacgcggct- 

■aagtgatag — agag- 



104 9 gccagc-agcacaaca t tcct ta tatcgctattgaaacagactactccac 



agttcgatacccgcgtagaagctttggcagaaatga 
-cagtatgagacccgtgttcagggcttggtcgaagcca 



cagttaaaaacaaggttggaggcatttattgagatga 

1098 ctcggatgtcgggcagctcag tacccgtgtcgcggcctttattgagatgc 



tggaag-caaatgatgaaaagaagg-ggaaataa- 
t ttaa 
Figure 17

1 — mseektvdiesmsskealgyflpkvdedarkaJckegrlvcwsasvapp' 

1 — — mpktvs pgvqalrdwekvyrelr epker ge Jcvgws ss kfpc 

1 — ramklka— ieklmqkfa— srkeqlykqkeegrkvfgm 

1 mslvtdlpaifdqfsearqtg-fltvmdlkergiplvg— 

49 efctamdiaivypethaag igarhgapamlevaenkgynqdlcsycr 

43 elaesfrlhvgypenqaag iaanrdgevmcqaaedigydndicgyar 

35 -fcayvpieiila-anaip vglcggkndtipiae-edlpmlcplik 

36 tyctfmpqei pmaagavvvslcstsdetieeae-kdlprnlcplik 



90 islayaagfrgankndkdgnyvinphsgkcpakdangkkvfdadgkpvidp 

79 ssygf " 

83 ssygf— — — -— — — —————— -~ 



-lknspaspiplpdwltcnn 



102 ellkqqaltgetpev- 
140 Jctlkpfattdniyeiaalpegeekttrqnalhkyrqmtx^sn^wtvjccnn 

84 kkaktcpyfeasdiviget 

88 gktdkcpyf 3T fadlwg-et 

137 icntilkwyenlakelnvplinidvpf nhefpvtkhakqyivgef kfaalk 
190 Icnc^tkwyediarrhniplimidvpynefdhvneanvkyirsqldtaix 

103 tcegkkkmfelm— erlvpmhimhlphmkd edslkiwikeveklke 

107 tc^gkkkmyeyxnaefkpvhvmqlptisvkdd — - — aaralwkaemlrlqk 

187 qleaicgrpfayakrt^ 

240 qmeeitgkkfdedkfeq ccqnanrtakawlkvcdylqykpapfngfd 

147 lveketgnkiteeklke tvdkvnkvxelfyklyelrknkpapikgld 

152 tveerfgheieedalrdaialknrerralanfyiag qlnppalsgsd 

234 ifnymglavaarslnyseitfnkflkeldekvan— kkwafge«n- 

287 ifnhmadwtargrveaaeafellakeleqhvke — gtttapf — k- 

194 vlklfqfaylldiddtiglle dlieeleerv kk ge — gy 

X99 me wygatfrfdk ealineldamtaxvrqqweegqrld- 

276 ^srytwegiavwialghtfkelkgqgalmtg -aay pgmwdvsy 

329 eqhrimf egipcwpklpivlf kplkanglnitg -wy apafgfvy 

231 egkrilitgcpmvagnnkiveiieevggvwg -eesctgtrff enfv 

238 prprilitgcpiggaaekvvraieenggwvvgyenctga kateqcva 

319 epgdl-esmaeaysrtyinccl — eqrgavlekwrdgkcdglimhqnra 
372 — nnl-delvkayckapnsvai — eqgvawreglirdnkvdgvlvhynra 

277 egysv-ediakryfUpcacrfkndervenlkrlvkeldvd gvvyyt lqy 
285 etgdvydaladkylaigcscvspndqrlknasqpnveeyqvayvvdvilqa 

366 cknmsllnnegg--qriqknlgvpyvifdgdqtdarnfseaqfdtrveal 
417 C kpwsgyinpem^--rrftkdmgiptag£dgdqadprnfnaaqyetrvqgl 

326 cht fnlegakveealkeegipiirietdyses— dreqlktrleaf 

335 chtyaveslaik— rnvrqqhnipyiai etdystsdvgqlstrvaaf 

414 aemmadkkaneggnb 
465 veameandekkgk — 

370 iemi 

380 
Figure 18



ATGAGTCAGATCGACGAACTTATCAGCAAATTACAGGAAGTATCCAACCATCCCCAGAAG
ACGGTTTTGAATTATAAAAAACAGGGTAAAGGCCTCGTAGGCATGATGCCCTACTACGCT 
CCGGAAGAAATCGTATATGCTGCAGGCTACCTCCCGGTAGGCATGTTCGGTTCCCAGAAC 
CCGCAGATCTCCGCAGCTCGTACGTACCTTCCTCCGTTCGCTTGCTCCTTGATGCAGGCT 
GACATGGAACTCCAGCTCAACGGCACCTATGACTGCCTCGACGCTGTTATCTTCTCCGTT
CCTTGCGACACTCTCCGCTGCATGAGCCAGAAATGGCACGGCAAAGCTCCGGTCATCGTC 

TTCACACAGCCGCAGAACCGTAAGATCCGCCCGGCTGTCGATTTCCTCAAAGCTGAATAC
GAACATGTCCGTACGGAATTGGGACGTATCCTCAACGTAAAAATCTCCGACCTGGCTATC
CAGGAAGCTATCAAAGTATATAACGAAAACCGTCAGGTTATGCGTGAATTCTGCGACGTA

GCTGCTCAGTACCCGCAGATCTTCACTCCGATAAAACGTCATGACGTGATGAAAGCCCTC 

TGGTTCATGGACAAAGCTGAACACAGCGCTTTGGTCCGCGAACTCATCGACGCTGTCAAG 

AAAGAACCGGTACAGCCGTGGAATGGCAAAAAAGTCATCCTCTCCGGTATCATGGCAGAA

CCGGATGAATTCCTCGATATCTTCAGCGAATTCAACATCGCTGTCGTCGCTGACGACCTC 

GCTCAGGAATCCCGCCAGTTCCGTACAGACGTACCGTCCGGCATCGATCCCCTCGAACAG 

CTCGCTCAGCAGTGGCAGGACTTCGATGGCTGCCCGCTCGCTTTGAACGAAGACAAAGCG 

CGTGGCCAGATGCTCATCGACATGACTAAGAAATACAATGCTGACGCCGTCGTCATCTGC 

ATGATGCGTTTCTGCGATGCTGAAGAATTCGACTATCCGATTTACAAACCGGAATTTGAA 

GCTGCTGGCGTTCGTTACACGGTCCTCGACCTCGACATCGAATCTCCGTCCCTCGAACAG 

CTCCGCACCCGTATCCAGGCTTTCTCGGAAATCCTCTAA (SEQ ID NO:25) 
Figure 19

msqidelisklqevsnhpqktvlnykkqgkglvgmmpyyapeeivyaagylpvgmfgsqn
pqisaartylppfacslmqadmelqlngtydcldavifsvpcdtlrcmsqkwhgkapviv
ftqpqnrkirpavdflkaeyehvrtelgrilnvkisdlaiqeaikvynenrqvmrefcdv
aaqypqiftpikrhdvikarwfmdkaehtalvrelidavkkepvqpwngkkvilsgimae
pdefldifsefniavvaddlaqesrqfrtdvpsgidpleqlaqqwqdfdgcplalkedkp
rgqmlidmtkkynadavvicmmrfcdpeefdypiykpefeaagvrytvldldiespsleq
lrtriqafseil (SEQ ID NO:26)
Figure 20

SEQ ID NO: 25 1 atgagtcagatcgacgaacttatcagcaaattacaggaagtatccaacca 

SEQ ID NO: 27 1 atggct — -atcagtgcacttattgaagagttccaaaaagtat-ctgcca 

SEQ ID NO: 28 1 atgatgaaattaaaggcaattgaaaagttgatgcaaaaat 

SEQ ID NO: 29 1 atgtcacttgtcaccgatctacccgccattttcgatcagttctctgaagc 

SEQ ID NO: 25 51 tccccagaag ac ggttttg— aattataaaaaa 

SEQ ID NO: 27 47 gccc — gaag ac— catgctggccaaatataaagcc 

SEQ ID NO: 28 41 tcgccagtag aaaagaacagctatat aagcaaaaagaa 

SEQ ID NO: 29 51 tcgccagacaggctttctcac cgtcatg gatctcaaggag 

SEQ ID NO: 25 82 cagggtaaaggcctcgtaggca— tgatgccctactacgctccggaagaa 
SEQ ID NO:27 79 cagggcaaaaaagccatcggct— gcctgccgtactatgttccggaagaa
SEQ ID NO:28 79 gaaggtagaaaagtttttggaa— tgttctgtgcctatgttccaatagaa 
SEQ ID NO:29 91 cgcggcattccgctggttggcacttactgcacctttatgc — cgcaagag 

SEQ ID NO: 25 130 atcgtatatgctgcaggctacctcccggtaggcatgt tcggttccca 

SEQ ID NO: 27 127 ctggtctatgctgcaggcatggttcccatgggtgtat ggggctgcaa 

SEQ ID NO: 28 127 ataattttagcagcaaatgcaatcccagttggtttgt gtggaggtaa 

SEQ ID NO:29 139 atcccgatggcagccgg tgcggttgtggtttcgctctgttccac

SEQ ID NO: 25 177 gaacccgcag-atctccgcagctcgtacgtaccttcctccgtt 

SEQ ID NO: 27 174 tggcaaacaggaagtccgttccaaggaa-tactgtgcttcctt 

SEQ ID NO: 28 174 — — — aaatgacaca-atcccaatagcagaggaggatttgccaagaaa 

SEQ ID NO:29 183 ctctgatgaaacc attgaagaagcggagaaagatctgccgcgcaa 

SEQ ID NO: 25 219 cgcttgctccttgatgcaggctgacatggaactccagctcaacggca 

SEQ ID NO: 27 216 ctactgcaccattgcccagcagtctctggaaatgctgctggacggga~- 

SEQ ID NO:28 216 cctatgcccattaataaaatcatcctatggttttaag aaggca

SEQ ID NO:29 228 cctctgcccgctga -ttaaaagcagctacggct — tcggcaaaa 

SEQ ID NO: 25 266 cctatgactgcctcgacgctgttatcttctcc gttcct-tgcg 

SEQ ID NO: 27 263 ccctggatgggttggacgggatcatca-ctcc ggtactgtgtg

SEQ ID NO: 28 259 — aaaacctgcccttactttg-aegcatctgatatagttatt-ggag 

SEQ ID NO:29 269 ccgataaatgcccctac ttctacttttc -ggatct-ggtggtc 

SEQ ID NO: 25 308 acactctccgctgcatgagccagaaat gg -c- 

SEQ ID NO: 27 305 ataccctgcgtcccatgagccagaacttcaaagtgg 

SEQ ID NO: 28 302 aaact acctgtgaa — gg



SEQ ID NO:29 310 ggtgaaaccacctgcgacggcaaaaagaaaa- — tgtatgaatac-

SEQ ID NO: 25 338 — — acggcaaagct ccggtcatcg-tcttcacacagccgcagaac 

SEQ ID NO:27 343 atgaaagacaagatg— — ccggttattt-tcctggctcatccccaggtc 

SEQ ID NO: 28 319 aagaagaagat gtttgagttgatggagagattggtgccaatg 

SEQ ID NO: 29 352 atggcggagtttaagcctgttcatg-tgatgcaattgcccaacagc 

SEQ ID NO: 25 379 cgtaaga-tccgcccggc tgtcgatttcctcaaag-ct 

SEQ ID NO: 27 388 cgtcagaatgccgccggc aagc-agttcacctatg-at 

SEQ ID NO: 28 361 catataa-tgcacctcccacacatgaaagatgaagattctttgaaaatct

SEQ ID NO: 29 397 gttaagg-acgatgcctc — gcgtgcgttatggaaag-cc 

SEQ ID NO: 25 415 gaat — acgaacatgtc cgt acgg — aattgg — — gacg 

SEQ ID NO: 27 424 gcct — acagcgaagt — ga aaggccatctgg aaga 

SEQ ID NO:28 410 ggattaaagaagttgaaaagcta aaag — aattggttgagaaa 

SEQ ID NO: 29 433 ga- gatgctgcg cttgcaaaaaacgg— tagaag aacg

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SEQ ID NO: 25 447 tatcctcaacgtaaaa — atctccgacctggctatccaggaagctatcaa
SEQ ID NO: 27 456 aatctgcggcca tgaa — atcaccaatgatgccatcctggatgccatcaia
SEQ ID NO: 28 451 gagactggaaataaaataacagaggaaaagttaaaagagacagttgataa
SEQ ID NO: 29 468 ttttgggcacg ag — att agcgaaga tgctct gcgcga tgccat tgc

SEQ ID NO: 25 495 agtata taacgaaaaccgtcaggttatgcgtgaattct-
SEQ ID NO: 27 504 agtgtacaacaagagccgtgctgcccgccgcgaattct
SEQ ID NO: 28 501 agtaaataaagtta gggagttgttttataaactct atg
SEQ ID NO: 29 513 gctgaaaaaccgcgaacgtcgcgcactggctaatttttatcatcttgggc

SEQ ID NO: 25 536 acgtagctgctcag tacccgcagatcttcactccgataaa — acg
SEQ ID NO: 27 545 aactggc— caacg aacatcctgatctgatcccggcttccgtacg
SEQ ID NO: 28 539 a-attgaggaagaa taaac-cag ctccaattaa — ggg
SEQ ID NO: 29 563 agttaaatcctccggcgcttagcggcag— cgacattctgaaagt — ggt

SEQ ID NO: 25 579 tcatgacgtcatc-aaag cccgctgg — ttca
SEQ ID NO: 27 588 ggccaccgtactg-cgtg ccgcttac — — ttca
SEQ ID NO: 28 573 tttagatgtttta aaattattccagtttgcctatttat
SEQ ID NO: 29 609 ttacggcgcaaccttccggttcgataaag—- aggcgttg atca

SEQ ID NO: 25 608 tggacaaagctgaacacaccgctttggtccgcgaactcatcgacgctgtc
SEQ ID NO: 27 617 .tgctgaaggatgaatacaccgaaaagctggaagaact.gaacaagg--^
SEQ ID NO: 28 611 tggatattgatgacacaatagggattttagaggatttaattgaggagtta
SEQ ID NO: 29 650 atgaactggatgcaatgaccgc ccgcg — ttcgtcagcagtggg



SEQ ID NO: 25 658 aagaa 

SEQ ID NO: 27 — 662 aactg 

SEQ ID NO: 28 661 gaggagagagttaa— aaaaggagaaggttatgaa 

SEQ ID NO: 29 692 aagaa ggccagcgactggacccgr-^— 



•ag — aaccggtacagccgtggaat ggcaaaaaa 

•gc— agctgetcetgccggcaagfetGgacggccacaaa 

ggaaagaga 

itttta 



SEQ ID NO: 25 694 gtcatcctctccggt atcatggcagaaccggatgaattcct
SEQ ID NO: 27 703 gtggttgtttccggc — r atcatctacaacacgcccggcatcct
SEQ ID NO: 28 703 attttaataactggctgtccaatggttgctggaaacaataagattgt-' —
SEQ ID NO: 29 730 atcaccggctgcccg attggcggcgcagcagaaaaagtggtgcg



SEQ ID NO: 25 735 cgatatcttcagcgaatt-caacatcgctgtcgtcgctgacgacctc-gc 

SEQ ID NO:27 744 gaaagccatggatgacaa-caaactggccattgctgctgatgactgc-gc

SEQ ID NO: 28 750 -^^alftta«:gaggaagt-tggaggagtagttgttggtgaagaaagctgc

SEQ ID NO: 29 774 cgcgat-tgaagagaatggcggctgggttgtcggttatgaaaactgc-ac 



SEQ ID NO: 25 783 tcagga-atcccgccagttccgtacagacgtaccgtccggcatcgatccc
SEQ ID NO: 27 792 tta tga-aagccgcagctttgccgtggatgctccggaagatctgga c
SEQ ID NO: 28 799 actgga-a — caagat tctttgaaaactttgttgagg — gctatagc
SEQ ID NO: 29 822 ^oaqqcaa^a^cg accgagcaatgc - gtggca

SEQ ID NO: 25 832 ctcgaacagctcgctcag cagtgg ~caggacttcgat-g
SEQ ID NO: 27 838 aacggactgcatgctctggctgtacagttctccaaacagaagaacgat-g
SEQ ID NO: 28 841 gtagaggacattgcaaa -aaga-tacttt-a
SEQ ID NO: 29 868 tacgacgcgctggcggat-aaatat ctgg cgattg

SEQ ID NO: 25 869 gctgcccgctcgctttgaa cgaagacaaaccgcg-tggccag
SEQ ID NO: 27 887 ttctgctgtacgatcc — tgaatttgccaagaatacccgttctgaacac
SEQ ID NO: 28 869 aaatcccatgtgcttgta gatttaaaaacgat-gagagag
SEQ ID NO: 29 902 gctgctc-ctgtgtttcgc cga— acgatcagcg-cctgaaa

SEQ ID NO: 25 910 atgctcatcgaca tgactaagaaatacaatgctgacgccgtcgtc
SEQ ID NO: 27 934 gttggca ate tggtaaaagaaagcggcgcagaaggactgatc
SEQ ID NO: 28 908 ttgaaaatataaagagattggttaaagagttggacgtcgatggagttgtt
SEQ ID NO: 29 940 atgctcagccaga tggtggaggaatatcaggtcgatggcgtagtt
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SEQ ID NO:25 955 atctgcatgatgcgtttctgcgatcctgaagaattcgactatc cgat 

SEQ ID NO:27 976 gtgttcatgatgcagttctgcgatccggaagaaatggaatatc ctga 

SEQ ID NO: 28 958 tattacactttgcagtattgccatacatttaacatagagggag ctaa

SEQ ID NO: 29 985 gatgtgattttgcaggcgtgccatacctacgcggtggaatcgctggcgat

SEQ ID NO: 25 1002 ttacaaaccggaatttgaagctgctgg— — cgttcgttacacggtcctc 

SEQ ID NO: 27 1023 tctgaagaaggctctggatgcccacca— — cattcctcatgtgaagatt 

SEQ ID NO: 28 1005 ggtagaggaggcattaaaagaggaggg — cattc— caattata 

SEQ ID NO: 29 1035 t aaacgtcatgtgcgccagcagcacaacattccttatatcgctatt 

SEQ ID NO:25 1048 gacctcgacatcgaatctccgtccctcgaa- — - --cagctccgcacccg 

SEQ ID NO: 27 1069 ggtgtggaccagatgacccgggactttggt caggcccagaccgc 

SEQ ID NO: 28 1045 agaattgaaactgactattctgaaagtgatagagagcagttaaaaacaag

SEQ ID NO: 29 1081 gaaacagactactccacctcggatgtcggg- -cagctcagtacccg 

SEQ ID NO: 25 1092 tatccaggctttctcggaaatcctctaa 

SEQ ID NO: 27 1113 tctggaagctttcgcagaaagcctgtaa 

SEQ ID NO: 28 1095 gttggaggcatttattgagatgatttaa 

SEQ ID NO:29 1125 tgtcgcggcctttattgagatgctgtaa 
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Figure 21 



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SEQ ID NO: 26 1 msqidelisklqevsnhpqk tvlnykkqgkglvgmmpyyapeeivya
SEQ ID NO: 30 1 -maisalieefqkvsaspkt mlakykaqgkkaigclpyyvpeelvya
SEQ ID NO: 31 1 mmkl-kaieklmqkf aarke qlykqkeegrkvf gmf cayvpleiila
SEQ ID NO: 32 1 mslvtdlpaifdqfsearqtgf Itvnudlkergiplvgtyctfmpqeipma

SEQ ID NO: 26 48 agylpvgmfgsqnpqisaartylppfacslmqadmelqlngt- — ydc —
SEQ ID NO: 30 47 agmvpmgvwgcngkqevrakeycasf yctiaqqs lemlldgt ldg —
SEQ ID NO: 31 47 anaipvglcggkndtipiaeedlprnlcpllkssygfkkaktcpyfea —
SEQ ID NO: 32 51 agavwslcstsdetieeaekdlprnicplikaa — ygfgkt dkcpy

SEQ ID NO: 26 93 ldavifsvpcdtlrcmsqkwh gkapvivftqpqnrkirpavdf
SEQ ID NO: 30 92 ldgiitpvlcdtlrpmsqnfkvamkdkmpviflahpqvrqnaagkqf
SEQ ID NO: 31 95 sdivigettcegkkkmfelme rlvpmhimhlp-hmkdedslki
SEQ ID NO: 32 96 f yfsdl wge ttcdgkkkmyeyma efkpvh vmqlpns vkdda s ral

SEQ ID NO: 26 136 lkaeyehvrtelgrilnvkisdlaiqeaikvynenrqvmrefcdvaaqyp
SEQ ID NO: 30 139 tydaysevkghleeicgheitndaildaikvynksraarrefcklanehp
SEQ ID NO: 31 137 wikeveklkelveketgnkiteeklketvdkvnkvrelfyklyelrknkp
SEQ ID NO: 32 142 wkaemlr lqk t veer f gheis edalrdaial knrerralanf yhlgqlnp

SEQ ID NO: 26 186 qiftpikrhdvik arwf mdkaehtalvrelidavkk — epvqp
SEQ ID NO: 30 189 dlipaavratvlr aayf mlkdeytekleelnkelaa — apagk
SEQ ID NO: 31 187 apikgldvlk- lfqfaylldiddtiglledlieeleervklcgeg
SEQ ID NO: 32 192 palsgsdilkwygatfr fdk ealinel-damta — r vrqq

SEQ ID NO: 26 227 wn-gkk — vilsg — imaepdef ldifsefniavvaddlaqesrqf
SEQ ID NO: 30 230 fd-ghk — wveg — iiyntpgilkamddnklaiaaddcayesref
SEQ ID NO: 31 230 ye-gkr illtgcpmvagnnkiveiieevggvwgeesctgtrf f
SEQ ID NO: 32 230 weegqrldprprilitgcpiggaaekvvraieenggwvvgyenctgakat

SEQ ID NO: 26 268 rtdvpsgidp-leqlaqqwqdfdgcplalned kprgqmlidmtkkyn
SEQ ID NO: 30 271 avdapedldnglhalavqf skqkndvllydpef akntrsehvgnlvkesg
SEQ ID NO: 31 273 enfv-egye — vediakryf kip-cacrf load- — e-rvenikrlvkeld
SEQ ID NO: 32 280 eqcvaetgdv-ydaladkylai-gcscvspnd q-rlkmlsqmveeyq

SEQ ID NO: 26 314 adavvicmmrfcdpeefdypiykpef-eaagvrytvldldiespsleqlr
SEQ ID NO: 30 321 aeglivfinmqfcdpeeineypdlkkal-dabhiphvkigvdqmtrdfgqaq
SEQ ID NO: 31 315 vdgwyytlqychtfniegakvecal-keegipiirictdysesdreqlk
SEQ ID NO: 32 324 vdgwdvilqachtyavealaikrhvrqq^nipylaietdystsdvgqls

SEQ ID NO: 26 363 triqafseil
SEQ ID NO: 30 370 taleafaesl
SEQ ID NO: 31 364 trleafleml
SEQ ID NO: 32 374 trvaafieml
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Figure 22 

1 CGACGGCCCG GGCTGGTATC ATTCTAGTCA GTAATTCACC TTTGGAAAAT TTTCACAAAG 
61 GCAGTACGAC AGAAGCGTCG ATACATTCCA TTTAGCAGGA GGAAGTTACG GTAATGAGAA 
121 AAGTAGAAAT CATTACAGCT GAACAAGCAG CTCAGCTCGT AAAAGACAAC GACACGATTA 
181 CGTCTATCGG CTTTGTCAGC AGCGCOCATC CGGAAGCACT GACCAAAGCT TTGGAAAAAC 
241 GGTTCCTGGA CACGAACAOC CCGCAGAACT TGACCTACAT CTATGCAGGC TCTCAGGGCA 
301 AACGCGATGG CCGTGCCGCT GAACATCTGG CACACACAGG CCTTTTGAAA CGCGCCATCA 
361 TCGGTCACTG GCAGACTGTA CCGGCTATCG GTAAACTGGC TGTCGAAAAC AAGATTGAAG 
421 CTTACAACTT CTCGCAGGGC ACGTTGGTCC ACTGGTTCCG CGCCTTGGCA GGTCATAAGC 
481 TCGGCGTCTY CACCGACATC GGTCTGGAAA CTTTCCTCGA TCCCCGTCAG CTCGGCGGCA 
541 AGCTCAATGA CGTAACCAAA GAAGAOCTOG TCAAACTGAT CGAAGTCGAT GGTCATGAAC 
601 AGCTTTTCTA CCCGACCTTC CCGGTCAACG TAGCTTTCCT CCGCGGTAOG TATGCTGATG 
661 AATCCGGCAA TATCACCATG GACGAAGAAA TCGGGCCTTT CGAAAGCACT TCCGTAGOCC
721 AGGCCGTTCA CAACTGTGGC GGTAAAGTCG TCGTCCAGGT CAAAGACGTC GTCGCTGACG 
781 GCAGCCTCGA {XXGCGCATG GTCAAGATOC CTGGCATCTA TGTCGACTAC GTCGTCGTAG 
841 CAGCTCCGGA AGACCATCAG CAGACGTATG ACTGCGAATA CGATCCGTCC CTCAGCGGTG 
901 AACATCGTGC TCCTGAAGGC GCTACCGATG CAGCTCTCOC CATGAGCGCT AAGAAAATCA 
961 TCGGCCGCCG CGGCGCTTTG GAATTGACTG AAAACGCTGT CGTCAACCTC GGCGTCGGTG 
1021 CTCCGGAATA CGTTGCTTCT GTTGCCGGTG AAGAAGGTAT CGCCGATACC ATTAC CCTGA 
1081 CCGTCGAAGG TGGCGCCATC GGTGGCGTAC CGCAGGGCGG TGCCCGCTTC GGTTCGTOOC 
1141 GCAATGCCGA TGCCATCATC GACCACACCT ATCAGTTCGA CTTCTACGAT GGCGGCGGTC 
1201 TGGACATCGC TTACCTCGGC- CTGGOCCAGT GCGATGGCTCL-GGGCAACATC AACGTC AGCA 
1261 AGTTCGGTAC TAACGTTGCC GGCTGCGGCG GTTTCCCCAA CATTTCCCAG CAGAC ACCG A 
1321 ATGTTTACTT CTGCGGCAOC TTCACGGCTG GCGGCTTGAA AATCGCTGTC GAAGAOGGCA 
1381 AAGTCAAGAT CCTCCAGGAA GGGAAAGOCA AGAAGTTCAT CAAAGCTGTC GACCAGATCA 
1441 CTTTCAACGG TTCCTATGCA GCCCGCAACG GCABACACGT TCTCTACATC ACAGAAOGCT 
1501 GCGTATTTGA ACTGACCAAA GAAGGCTTGA AACTCATCGA AGTCGCACCG GGCATCGATA
1561 TTGAAAAAGA TATCCTCGCT CACATGGACT TCAAGCCGAT CATTGATAAT CCGA AACTC A 
1621 TGGATGCCCG CCTCTTCCAG GACGGTCCCA TGGGACTGAA AAAATAAATC TCTGCTGTAA 
1681 AGGAGACTTT ACTATGAAAC CAATGAGACT ACATCACGTA GGCATTGTCC TGCCGACCTT 
1741 AGAAAAAGCC CATGAATTCA TGCAGAATAA TGGACTTGAA ATCGACTATG CCGGCTATGT 
1801 CGATGCTTAC CAGGCTGATC TCATTTTCAC TAAGTTTGGT GAATTTGCCA GCCCGATTGA 
1861 AATGATTATC CCGCACTCOG GTGTGCTTAC CCAATTCAAT GGTGGCCGCG GCGGCATTGC 
1921 CCACATCGCC TTCGAAGTGG ACGATGTCGA AGCTGTCCGC CAGGAAATGG AAGCAGATTG 
1981 TCCGGGATGC ATGTTAGAAA AGAAAGCTGT CCAGGGTACG GACGACATTA TCGTCAACTT 
2041 CCGCCGCCCG ACAACCAACC AGGGTATOCT CGTTGAATAT GTTCAGACGA CAGCACCTAT 
2101 CACCGGCCGC GGCGAAAATC CTTTCGTTAA GAATCTCGGC CCGGAAAAAG GGAAGCTCAA 
2161 CGAAACATGG CATCOCATGC GCCTGCACCA TATCGGCATC GTCTTGCCGA CCTTGGAAAA 
2221 GGCCCATGAA TTCATCAAGA CCAATGGTCT GGAAGTGGAT TATTCCGGTT TCGTCGAOGC 
2281 CTACCATGCG GATCTCATTT TCACTAAAAA AGGTGAAAAC AGTACGCCTA TCGAATTCAT 
2341 TATTCCCCGT GAAGGGGTOC TCAAAGATTT CAATCATGGC AGGGGAGGTA TCGCT CATAT 
2401 CGOCTTTGAA GTGGATGATG TCGAAAAGGT ACGTCAGATT ATGGAAAGCC AGAAGCCTGG 
2461 TTGCATGCTC GAAAAGAAAG CCGTOOGGGG AACGGACGAT ATCATOGTCA ACTTCCGOOG 
2521 TCCCAGCACG GACGCCGGCA TCCTCGTCGA ATATGTOCAG ACCGTAGCTC CCATCAATOG 
2581 CAGCAATCGC AAOCCTTTTA ATGATTGATT TTTTATAAAGTXAAGGTGAAA ACTGTGTAffA 
2641 CTCTCGGAAT CGACGTTGGT TCTTCTTCTT CCAAGGCAGT CATCCTGGAA GATGGCAAQl 
2701 AGATCGTCGC CCATGOCGTC GTTGAAATOG GCACCGGTTC GACCGGTOOG GAACGOGTOC 
2761 TGGACGAAGT CTTCAAAGAT ACCAftCTTAA AAATTGAAGA CATGGCGAAC ATCATCGCCA 
2821 CAGGCTATGG CCGTTTCAAT GTCGACTGCG CCAAAGGOGA AGTCAGCGAA ATCACGTGOC 
2881 ATGCCAAAGG GGCOCTCTTT GAATGCCCCG GTACGACGAC CATCCTCGAT ATCGGCGGTC 
2941 AGGACGTCAA GTCCATCAAA TTGAATGGCC AGGGCCTGGT CATGCAGTTT GCCATGAAOG 
3001 ACAAATGCGC CGCTGGTACG GGCCGTTTCC TCGACGTCAT GTCGAAGGTA CTGGAAATOC 
3061 CCATGTCTGA AATGGGGGAC TGGTACTTCA AATCGAAGCA TCCCGCTGCC. GTCAGCAGTA 
3121 CCTGCAOGGT TTTTGCTGAA TCGGAAGTCA TTTCOCTTCT TTCCAAGAAT GTCCCGAAAG 
3181 AAGATATCGT AGCCGGTGTC CATCAGTCCA TCGCCGCCAA AGCCTGOGCT CTCGTGOGCC 
3241 GCGTCGGTGT CGGTGAAGAC CTGftCCATGA CCGGCGGTGG CTCCCGCGAT CCCGGCGTOG 
3301 TCGATGOCGT ATCGAAAGAA TTAGGTATTC CTGTCAGAGT CGCTCTGCAT CCCCAAGOGG 
3361 TGGGTGCTCT CGGAGCTGCT TTGATTGCTT ATGATAAAAT CAAGAAATAA GTCAAAGGAG 
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3421 AG&ACAAAAT CATGAGTGAA -GAAAAAACAG TAGATATTGA AAGCATGAGC TCCAAGGAAG 

3481 CCCTTGGTTA CTTCTTGCCG AAAGTGGATG AAGACGCACG TAAAGCGAAA AAAGAAGGOC 

3541 GCCTCGTTTG CTGGTCCGCT TCTGTCGCTC CTCCGGAATT CTGCACGGCT ATGGACATCG

3601 CCATCGTCTA TCCGGAAACT CACGCAGCTG GTATCGGTGC CCGTCACGGT GCTCCGGCCA 

3661 TGCTCGAAGT TGCTGAAAAC AAAGGTTACA ACCAGGACAT CTGTTCCTAC TGCCGCGTCA 

3721 ACATGGGCTA CATGGAACTC CTCAAftCAGC AGGCTCTGAC AGGCGAAACG CCGGAAGTGC 

3781 TCAAAAACTC CCCGGCTTCT CCGATTCCCC TTCCGGATGT TGTCCTGACT TGCAACAACA 

3841 TCTGCAATAC CTTGCTCAAA TGGTATGAAA ACTTGGCTAA AGAATTGAAC GTACCTCTCA 

3901 TCAACATCGA CGTACCGTTC AAOCATGAAT TCCCTGTTAC <5AAACACGCT AAACAGTACA 

3961 TCGTCGGCGA ATTCAAACAT GCTATCAAAC AGCTCGAAGA CCTTTGCGGC GGTCCCTTCG 

4021 ACTATGACAA ATTCTTCGAA GTAGAGAAAC AGACACAGOG CTCCATCGCT GCCTGGAACA 

4081 AAATCGCTAC GTACTTCCAG TACAAACOGT CGCCGCTCAA CGGCTTCGAC CTCTTCAACT 

4141 ACATGGGGCT CGCCGTTGCT t^CCGCTCCT TG AACTACTC GGAAATCACG TTCAACAAAT 

4201 TCCTCAAAGA ATTGGACGAA AAAGTAGCTA ATAAGAAATG GGCTTTCGGT GAAAACGAAA 

4261 AATCCCGTGT TACTTGGGAA GGTATCGCTG TCTGGATCGC TCTCGGOCAC ACCTTC AAAG 

4321 AACTCAAAGG TCAGGGCGCT CTCATGACTG GTTCCGCTTA TCCTGGGATG TGGGAGGTTT 

4381 CCTACGAACC GGGCGACCTC GAATOCATGG CAGAAGCTTA TTCCCGTACA TACATCAACT 

4441 GCTGCCTCGA ACAGCGOGGT GCTGTTCTTG AAAAAGTTGT OCGCGATGGC AAATGCGAOG 

4501 GCTTGATCAT GCACCAGAAC CGTTCCTGCA AGAACATGAG CCTCCTCAAC AACGAAGGCG 

4561 GCCAGCGCAT CCAGAAGAAC CTCGGCGTAC CGTACGTCAT CTTCGACGGC GACCAGACCG 

4621 ATGCTCGTAA CTTCTCGGAA GCACAGTTCG ATACCCGCGT AGAAGCTTTG GCAGAAATGA

4681 TGGCAGACAA AAAAGCCAAT GAAGGAGGAA AOCACTAATG AGTCAGATCG ACGAACTTAT 

4741 CAGCAAATTA CAGGAAGTAT CCAAOCATOC CCAGAAGAOG GTTTTGAATT ATAAAAAACA 

4801 GGGTAAAGGC CftGTAGGCA TGATGCCCTA CTACGCTCCG GAAGAAATCG TATATGCTGC 

4861 AGGCTACCTC CCGGTAGGCA TGTTOGGTTC CCAGAACCCG CAGATCTOOG CAGCTCGTAC 

4921 GTACCTTOCT CCGTTCGCTT GCTCCTTCAT GCAGGCTGAC ATGGAACTCC AGCTCAACGG 

4981 CACCTATGAC TGCCTCGACG CTGTTATCTT CTCCSTTCCT TGCGACACTC TCCGCTGCAT 

5041 GAGCCAGAAA TGGCACGGCA AAGCTCCGGT CATCGTCTTC ACACAGCOGC AGAACCGTAA 

5101 GATCCGCOCG GCTGTCGATT TCCTCAAAGC TGAATACGAA CATGTCOGTA CGGAATTGGG 

5161 AOGTATCCTC AACGTAAAAA TCTCCGACCT GGCTATCCAG GAAGCTATCA AAGTATATAA 

5221 CGAAAACCGT CAGGTTATGC <3TGAATTCTG CGACGTAGCT GCTCAGTACC CGCAGATCTT 

5281 CACTCCGATA AAACGTCATG ACGTCATCAA AGCCCGCTGG TTCATGGACA AAGCTGAACA 

5341 CACCGCTTTG GTCCGCGAAC TCATOGACGC TGTCAAGAAA GAACCGGTAC AGCCGTGGAA 

5401 TGGCAAAAAA GTCATCCTCT CCGGTATCAT GGCAGAACOG GATGAATTCC TCGATATCTT 

5461 CAGCGAATTC AACAtCGCTG TCGTCGCTGA CGACCTCGCT CAGGAATCCC GCCAGTTOCG 

5521 TACAGACGTA CCGTCCGGCA TCGATGCCCT CGAACAGCTC GCTCAGCAGT GGCAGGACTT 

5581 CGATGGCTGC CCGCTCGCTT TGAACGAAGA CAAACCGCGT GGCC AGATGC TCATCGACAT 

5641 GACTAAGAAA TACAATGCTG ACGCCGTOGT CATCTGCATG ATGCGTTTCT GCGATCCTGA 

5701 AGAATTCGAC TATCCGATTT ACAAACCGGA ATTTGAAGCT GCTGGCGTTC GTTACA CGGT 

5761 CCTCGACCTC GACATCGAAT CTCCGTOCCT CGAACAGCTC CGCACCCGTA TCCAGGCTTT 

5821 CTCGGAAATC CTCTAAGAAT CGCCTGAATC ATCAAACATC TGGGCGGGAC TCCGAAAGGT 

5881 GCCTGCTAGA TGATACATTG CCTGTTTTCA GGCAGACAGA TTTGCAGCTT GCGGCCCOCA 

5941 TTGTACGGGC TGCAAGCTGT CAATGATGCT TTAAAGACGG CTCTGCCGTT TTTAAATAAA 

6001 AACATAAAAC CATATATAAT CTATTAGGAG GAAACTCAAT CATGGAATTC AAACTTTCTG 

6061 AATTACAGCA AGATATCGCA AATCTCGCAA AAGATTTCGC AGAAAAAAAA TTAGCTCCCA 

6121 CTGTCAAAGA GCGTGACGAA AAAGAAGTTT TCGATCGTGC TATCCTTGAC GAAGTGGGTA 

6181 CTCTCGGOCT TCTCGGTATT CCCTGGGAAG AAGAAAACGG CGGCGTAGGC GCTGACTTOC 

6241 TCAGCCTCGC AGTTGCTTGC GAAGAAGTAG CTAAAGTTAC CAGGCCGGGC CGTCG (SEQ ID NO: 33)
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Figure 23 



ATGAAACCAATGAtSACTACATCAGGTAGGCATTGTCCTGGCGACCTTAGAAA^GCCCAT 
<3AATTCATGCAGAATAATGGACTTGAAATCGACTATGCCGGCTATGTCGATGCTTACCAG 

gctgatctcattttcactaagtto^ 
cacttcggtgtgcttacccaatt^ 
gaagtggacgatgtosaagctgtccgccaggaaat^ 

ttagaa7vagaaagctgtccagggtacggaiggacattatcgtcaacttccgccgcccgaca 
accaaccagggtatcctcgttgaatatgttcagacgacagcacctatcaccggccgcggc 

GAAAATCCTTTCGTTAAGAATCTCGGCCCGGAAAAAGGGAAGCTCAACGAAACATGGCAT 

CCCATGCGCCTGCACCATATCGGCATCGTCTTGCCGACCTTGGAAAAGGCCCATGAATO 

ATCAAGACCAATGGTCTGGAAGTGGATTATTCCGGTTTCGTCGACGCCTACCATGCGGAT 

CTCATTTTCACTAAAAAAGGTGAAAACAGTACGCCTATCGAATTCATTATTCCCCGTGAA 

GGGGTCCTCAAAGATTTCAATCATGGCAGGGGAGGTATCGCTCATATCGCCTTTGAA<?TC 

GATGATGTCGAAAAGGTACGTCAGATTATGGAAAGCCAGAAGCCTGGTTGCATGCTCGAA 

AAGAAAGCCGTCCGGGGAACGGACGATATCATCGTCAACTTCCGCCGTCCCAGCACGGAC 

GCCGGCATCCTCGTCGAATATGTCCAGACCGTAGCTCCCATCAATCGCAGCAATCCCAAC 

CCTTTTAATGATTGA (SEQ ID NO: 34) 
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Figure 24 

MKPMRLHHVGIVLPTLEKAHE FMQNNGLEI DYAGYVDAYQADLI FTKFGEFAS PIEMI I P 

HSGTOjTQFNGGRGGI AHI AFEVDDVEAVRQEMEADCPGCMLEKKAVQCT DDI I VNFRRPT 

TNQGILVEYVQTTAPITGRGENPFVKNLGPEKGKLNETWHPMRLHHIGIVLPTI^BCIUiEF 

IKTNGLEVDYSGFVDAYHADLIFTKKGENSTPIEFIIPREGVLIUDFNHG 

D DVERVRQI ME S QKPGCMLEKKAVRGT DD I 1 VN FRRP S T DAG ILVEYVQTVAP I NRSN PN 

PFND (SEQ ID NO: 35)
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Figure 25 

ATGG2U^TTCAAACTTTCTGAATTACAGCAAGATATCGCAAATCTCGCAAAAGATTTCGCA 

gaaaaaaaatt agctcccact gtc aaagagcgt gacg aaaaag aagttttcgatcgtgct 
atc<:ttgacgaagtg<^tactc^ 

<5GCGTAGGCGCTGACTTCCTCAGCCTCGCAGTTGCTTGCGAAGAAGTAGCTAAA<3TTACG 
AGCCCGGGCGGTCG (SEQ ID NO: 36) 
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Figure 26 

MEFIOiSELQQDIANIJUCDFMKKLAPTVraRDEKEVFDRAILDEVGTK 
GVG ADFLS LAVACEEVAKVTS PGR (SEQ ID NO: 37) 
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Figure 27 

1 GTGAGCACAC ACTTGATAGC TGATGCCGTC AATGATCAGT TGTTCGTCTA TAGCAGGCTG 
61 AAAGGACATG GGTTTGGTCA CAGTCTGAGC AGTTGCAGGC AGTCAAACAC GTTCGTAACT 
121 ACGCTGTAGA TGATATAAGC AGTATACCAT CTTGCTACGC TCTCGTTGAT CAGGTTGAAT 
181 GCTTTGAGGA AGGTCAGGCG AATAGCCATG CCTCTTGTTT CCAGAACATG GCATGGGGAT 
241 GGATCGACGG TACCCTGTCG GATGCATGCT ATGCGTGGCA TTCATATCAT CAACCAGAAT 
301 TTGATCTTGA ACTACACAGC AATTCTGCGC GTTATGCAAG TGTCTTCGGT CAGATGGTGA 
361 ACAATTCTCA ATTGTTGAGG TCTTGACGAA TTGCGTTATA CACTGTAGGC TATAGTATGC 
421 ACCCCTTGTT ATCTATATCA CAACCGGTCT ATTAGCATTT GCGTCAAGGA GGATGGTCGA 
481 TGATOGACAC TGCGCCOCTT GCOCCACCAC GGGCGOCCCG CTCTAATCCG ATTCGGGATC 
541 GAGTTGATTG GGAAGCTCAG CGCGCTGCTG CGCTGGCAGA TCCCGGTGOC T7TCATGG06 
601 CGATTGCCCG GACAGTTATC CACTGGTACG ACCCACAACA CCATTGCTGG ATTCGCTTCA
661 ACGAGTCTAG TCAGCGTTGG GAAGGGCTGG ATGCCGCTAC CGGTGCCOCT GTAACGCTAG
721 ACTATCCCGC CGATTATCAG CCCTGGCAAC AGGOGTTTGA TGATAGTGAA GCGCCGTTTT 
781 ACCGCTGGTT TAGTGGTGGG TTGACAAATG CCTGCTTTAA TGAAGTAGAC OGGCATGTCA 
841 TGATGGGCTA TGGCGACGAG GTGGCCTACT ACTTTGAAGG TGACCGCTGG GATAACTOGC 
901 TCAACAATGG TCGTGGTGGT CCGGTTGTOC . AGGAGACAAT CACGCGGCGG CGCCTGTTGG 
961 TGGAGGTGGT GAAGGCTGOG CAGGTGTTGC GTGATCTGGG CCTGAAGAAG GGTGATCGGA 
1021 TTGCTCTGAA TATGCCGAAT ATTATGCCGC AGATTTATTA TACGGAAGOG GCAAAACGAC 
1081 TGGGTATTCT GTACACGCCG GTCTTCGGTG GCTTCTCGGA CAAGACTCTT TCCGACCGTA 
1141 TTCACAATGC CGGTGCACGA GTGGTGATTA CCTCTGATGG TGCGTAOOGC AACGCGCAGG 
1201 TGGTGCCCTA CAAAGAAGCG TATACCGATC AGGCGCTCGA TAAGTATATT CCGGTT GAGA 
1261 CGGCGCAGGC GATTGTTGCG CAGACCCTGG CCACCTTGCC CCTGACTGAG TCGCAGCGCC 
1321 AGACGATCAT CACCGAAGTG GAGGCCGCAC TGGCCGGTGA GATTACGGTT GAGCGCTCGG 
1381 AOGTGATGOG TGGGGTTGGT TCTGOCCTOG CAAAGCTCOG CGATCTTGAT GCAAGCGTGC
1441 AGGCAAAGGT GCGTACAGTA CTGGCGCAGG CGCTGGTCGA GTCGCOGOOG CGGGTTGAAG 
1501 CTGTGGTGGT TGTGOGTCAT ACCGGTCAGG AGATTTTGTG GAACGAGGGG CGAGATCGCT 
1561 GGAGTCACGA CTTGCTGGAT GCTGCGCTGG CGAAGATTCT GGCCAATGCG CGTGCTGCCC 
1621 GCTTTGATGT GCACAGTGAG AATGATCTGC TCAATCTCCC CGATGACCAG CTTATCCGTG 
1681 CGCTCTACGC CAGTATTCCC TGTGAACCGG TTGATGCTGA ATATCOGATG TTTATCATTT 
1741 ACACATCGGG TAGCACOGGT AAGCCCAAGG GTGTGATOCA OGTTCftOGGC GGTTATGTCG 
1801 CCGGTGTGGT GCACACCTTG CGGGTCAGTT TTGACGCCGA GCCGGGTGAT ACGATATATG 
1861 TGATOGCCGA TCCGGGCTGG ATCACCGGTC AGAGCTATAT GCTCACAGCC ACAATGGCCG 
1921 -GTOGGCTGAC CGGGGTGATT GCCGAGGGAT CACCGCTCTT OOOCTCAGOC GGGCGTTATG 
1981 CCAGCATCAT CGAGCGCTAT GGGGTGCAGA TCTTTAAGGC GGGTGTGACC TTCCTCAAGA 
2041 CAGTGATGTC CAATCCGCAG AATGTTGAAG ATGTGCGACT CTATGATATG CACTCGCTGC 
2101 GGGTTGCAAC CTTCTGCGOC GAGCCGGTCA GTCCGGOGGT GCAGCAGTTT GGTATGCA6A 
2161 TCATGACCCC GCAGTATATC AATTCGTACT GGGCGACOGA GCACGGTGGA ATTGTCTGGA 
2221 CGCATTTCTA CGGTAATCAG GACTTCCCGC TTCGTOOCGA TGCCCATAOC TATCCCTTGC 
2281 CCTGGGTGAT GGGTGATGTC TGGGTGGCCG AAACTGATGA GAGCGGGACG ACGCGCTATC 
2341 GGGTCGCTGA TTTCGATGAG AAGGGCGAGA TTGTGATTAC OGCOOOGTAT CCCTA OCTGA 
2401 CCCGCACACT CTGGGGTGAT GTGCCCGGTT TCGAGGCGTA CCTGOGCGGT GAGATTCOGC 
2461 TGCGGGCCTG GAAGGGTGAT GCCGAGCGTT TCGTCAAGAC CTACTGGCGA CGTGGGCCAA 
2521 ACGGTGAATG GGGCTATATC CAGGGTGATT TTGCCATCAA GTACCCCGAT GGTAGCTTCA 
2581 CGCTCCAOGG ACGCCCTGAC GATGTGATCA ATGTGTCGGG CCACCGTATG GGCACCGAGG 
2641 AGATTGAGGG TGCCATTTTG CGTGACOGCC AGATCACGOC CGACTOGCCC GTCGGTAATT 
2701 GTATTGTGGT CGGTGCGCCG CACCGTGAGA AGGGTCTGAC OOOGGTTGCC TTCATTCAAC 
2761 CTGCGCCTGG CCGTCATCTG ACCGGCGCCG ACCGGCGCOG TCTCGATGAG CTGGTGOGTA 
2821 CCGAGAAGGG GGCGGTCAGT GTCCCAGAGG ATTACATCGA GGTCAGTGCC TTTCCCGAAA 
2881 CCCGCAGCGG GAAGTATATG CGGCGCTTTT TGCGCAATAT GATGCTOGAT GAACCACTGG 
2941 GTGATACGAC GAOGTTGOGC AATOCTGAAG TGCTCGAAGA GATTGCAGCC AAGATCGCTG 
3001 AGTGGAAACG CCGTCAGCGT ATGGCCGAAG AGCAGCAGAT CATCGAACGC TATCGCTACT 
3061 TCCGGATCGA GTATCACCCA CCAACGGCCA GTGCGGGTAA ACTCGCGGTA .GTGACGGTGA 
3121 CAAATCOGOC GGTGAACGCA CTGAATGAGC GTGCGCTCGA TGAGTTGAAC ACAATTGTTG 
3181 ACCACCTGGC CCGTCGTCAG GATGTTGCCG CAATTGTCTT CACCGGACAG GGCGCCAGGA 
3241 GTTTTGTCGC CGGCGCTGAT ATTCGCCAGT TGCTCGAAGA GATTCATACG GTTGAAGAGG 
3301 CAATGGCCCT GCCGAATAAC GCCCATCTTG CTTTCCGCAA GATTGAGOGT ATGAATAAGC 
3361 CGTGTATCGC GGCGATCAAC GGTGTGGCGC TCGGTGGTGG TCTGGAATTC GCCATGGOCT 
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3421 t^CCATTACCG GGTTGCCGAT <3TCTATGCCG AATTCGGTCA GCCAGAGATT AATCTGCGCT 
3481 TGCTACCTGG TTATGGTGGC ACGCAGCGCT TGCCGCGCCT GTTGTACAAG CGCAACAACG 
3541 GCACCGGTCT GCTCCGAGCG CTGGAGATGA TTCTGG6TGG GCGTAGCGTA CCGGCTGATG 
3601 AGGCGCTGAA GCTGGGTCTG ATCGATGCCA TTGCTACCGG CGATCAGGAC . TCACTGTCGC 
3661 TGGCATGCGC GTTAGCCCGT GCCGCAATCG GCGCCGATGG TCAGTTGATC GAGTCGGCTG 
3721 CGGTGACCCA tSGCTTTCCGC CATCGCCACG AGCAGCTTGR CGAGTGGCGC AAACCAGACC 
3781 CGGGCTTTGC CGATGACGAA CTGCGCTCGA TTATCGCCCA TCCACGTATC GAGCGGATTA 
3841 TCCGGCAGGC CCATACCGTT GGGCGCGATG CGGCAGTGCA TCGGGCACTG GATGCAATCC 
3901 GCTATGGCAT TATCCACGGC TTCGAGGCCG GTCTGGAGCA CGAGGCGAAG CTCTTTGCCG 
3961 AGGCAGTGGT TGACCOGAAC GGTGGCAAGC GTGGTATTCG CGAGTTCCTC GACCGCCAGA 
4021 GTGCGCCGTT GCCAACCCGC CGACCATTGA TTACACCTGA ACAGGAGCAA CTCTTGCGCG 
4081 ATCAGAAAGA ACTGTTGCCG GTTGGTTCAC CCTTCTTCCC CGGTGTTGAC CGGATTCOGA 
4141 AGTGGCAGTA CGCGCAGGCG GTTATTCGTG ATCCGGACAC CGGTGOGGCG GCTCACGGCG
4201 ATCCCATCGT GGCTGAAAAG CAGATTATTQ TGCCGGTGGA ACGCOCOOGC GCCAATCAGG 
4261 CGCTGATCTA TGTTCTGGCC TCGGAGGTGA ACTTCAAOGA TATCTGGGOG ATTACCGGTA 
4321 TTCCGGTGTC ACGGTTTGAT GAGCACGACC GCGACTGGCA CGTTACCGGT TCAGGTGGCA 
4381 TCGGCCTGAT CGTTGCGCTG GGTGAAGAGG CGCGACGCGA AGGCCGGCTG AAGGTGGGTG 
4441 ATCTGGTGGC GATCTACTCC GGGCAGTCGG ATCTGCTCTC ACCGCTGATG GGCCTTGATC 
4501 CGATG0CCGC CGATTTCGTC ATCCAGGGGA ACGACACGCC AGATGGATOG CATCAGCAAT 
4561 TTATGGTGGC CCAGGOCCCG CAGTGTCTGC CCATCCCAAC CGATATGTCT ATCGAGGCAG 
4621 CCGGCAGCTA CATCCTCAAT CTCGGTACGA TCTATOGCGC CCTCTTTAOG ACGTTGCAAA 
4681 TCAAGGCCGG ACGCACCATC TTTATOGAGG GTGCGGCGAC CGGTACOGGT CTGGACGCAG 
4741 CGOGCTCGGC GGCCCGGAAT GGTCTGOGOG TAATTGGAAT GGTCAGTTOG TCGTCAOGTG 
4801 CGTCTACGCT GCTGGCTGCG GGTGCOCAOG GTGCGATTAA CCGTAAAGAC CCGGAGGTT6 
4861 CCGATTGTTT CACGCGCGTG CCCGAAGATC CATCAGOCTG GGCAGCCTGG GAAGCC GOCG 
4921 GTCAGCCGTT GCTGGCGATG TTOCGGGCGC AGAACGACGG GCG ACTGG CC GATTATGTGG 
4981 TCTCGCACGC GGGCGAGACG GCCTTCCCGC GCAGTTTCCA GCTTCTCGGC GAGCC ACGCG 
5041 ATGGTCACAT TCCGACGCTC ACATTCTACG GTGCCACCAG TGGCTACCAC TTCACCTTOC
5101 TGGGTAAGCC AGGGTCAGCT TCGCOGRCCG AGATGCTGCG GCGGGOCAAT CTCCGCGOOG 
5161 GTGAGGCGGT GTTGATCTAC TACGGGGTTG GGAGCGATGA CCTGGTAGAT ACCGGCGCTC 
5221 TGGAGGCTAT CGAGGCGGCG CGGCAAATGG GAGCGCGGAT CGTCGTCGTT. ACC6TCAGCG 
5281 ATGCGCAACG CGAGTTTGTC CTCTCGTTGG GCTTOGGGGC TGCCCTAOGT GGTGTCGTCA 
5341 GCCTGGCGGA ACTCAAAOGG CGCTTOGGCG ATGAGTTTGA GTGGOOGOGC ACGATGCCGC 
5401 CGTTGCCGAA CGCCCGCCAG CACCCGCAGG GTCTGAAAGA GGCTGTCOGC CGCTT CAACG 
5461 ATCTGGTCTT CAAGCCGCTA GGAAGCGCGG TCGGTGTCTT CTTGCGGAGT GCCGACAATC 
5521 CGCGTGGCTA CCCCGATCTG ATCATCGAGC GGGCTGCCCA CGATGCACTG GOGGTGAGOG 
5581 CGATGCTGAT CAAGCCCTTC ACCGGACGGA TTGTCTACTT CGAGGACATT GGTGGGOGGC 
5641 GTTACTCCTT CTTCGCROCG CAAATCTGGG TGCGCCAGOG CCGCATCTAC ATGCCGACGG 
5701 CACAGATCTT TGGTACGCAC CTCTCAAATG CGTATGAAAT TCTGCGTCTG AATGATGAGA 
5761 TCAGCGCCGG TCTGCTGAOG ATTACCGAGC CGGCAGTGGT GCCGTGGGAT GAACTAOOOG 
5821 AAGCACATCA GGCGATGTGG GAAAATOGCC ACACGGCGGC CACTTATGTG GTGAATCATG 
5881 CCTTACCACG TCTCGGCCTA AAGAACAGGG ACGAGCTGTA CGAGGCGTGG AOGGCCGGOG 
5941 AGCGGTAGCG CGGATGGGTA TTGAACAGGT AACGGA CGGA AGATCQUfcGC TTCCGTOCGT 
6001 TATCTTTTGG CCGTCGAAGC GTGCTGAGCC GATTATCGTT GOCGTGGTTG TCCCGATGGG 
6061 CAGAOGOGCT CGAACCAGAT GATACCACOG ACGGCTATCG TCACCAAftCC GGCGAAGAOC 
6121 AGGTAAGCCT CTGAAGGACG C (SEQ ID NO: 38)
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Figure 28

1 MIDTAPLAPP RAPRSNPIRD RVDWEAQRAA ALADPGAFHG AIARTVIHWY DPQHHCWIRF 

61 NESSQRWEGL DAATGAPVTV DYPADYQPWQ QAFDDSEAPF YRWFSGGLTN ACFNEVDRHV 

121 MMGYGDEVAY YFEGDRWDNS LNNGRGGPW QETITRRRLL VEWKAAQVL RDLGLKKGDR 

181 IALNMPNIMP QIYYTEAAKR LGILYTPVFG GFSDKTLSDR IHNAGARWI TSDGAYRNAQ 

241 WPYKEAYTD QALDKYIPVE TAQAIVAQTL ATLPLTESQR QTIITEVEAA LAGEITVERS 

301 DVMRGVGSAL AKLRDLDASV QAKVRTVLAQ ALVESPPRVE AWWRHTGQ EILWNEGRDR 

361 WSHDLLDAAL AKILANARAA GFDVHSENDL LNLPDDQLIR ALYASIPCEP VDAEYPMFII 

421 YTSGSTGKPK GVIHVHGGYV AGWHTLRVS FDAEPGDTIY VIADPGWITG QSYMLTATHA 

481 GRLTGVIAEG SPLFPSAGRY ASIIERYGVQ IFKAGVTFLK TVMSNPQNVE DVRLYDMHSL 

541 RVATFCAEPV SPAVQQFGMQ IMTPQYIKSY WATEHGGIVW TRFYGNQDFP LRPDAHTYPL 

601 PWVMGDVWVA ETDESGTTRY RVADFDEKGE IVITAPYPYL TRTLWGDVPG FEAYLRGEIP 

661 LRAWKGDAER FVKTYWRRGP NGEWGYIQGD FAIKYPDGSF TLHGRPDOVI NVSGRRMGTE 

721 EIEGAILRDR QITPDSPVGN CIWGAPHRE KGLTPVAFIQ PAPGRHLTGA DRRRLDELVR 

781 TEKGAVSVPE DYIEVSAFPE TRSGKYMRRF LRNMMLDEPL GDTTTLRNPE VLEEIAAKZA 

841 EWKRRQRMAE EQQIIERYRY FRIEYHPPTA SAGKLAWTV TNPPVNALNE RALDELNTIV 

901 DHLARRQDVA AIVFTGQGAR SFVAGADIRQ LLEEIHTVEE AMALPNNAHL AFRKIERMNK 

961 PCIAA1NGVA LGGGLEFAMA CHYRVADVYA EFGQPEINLR LLPGYGGTQR LPRLLYKRHN 

1021 GTGLLRALEM ILGGRSVPAD EALKLGLIDA IATGDQDSLS LACALARAAI GADGQLIESA 

1081 AVTQAFRHRH EQLDEWRKPD PRFADDELRS IIAHPRIERI IRQARTVGRD AAVHRALDAI 

1141 RYGIIHGFEA GLEHEAKLFA EAWDPNGGK RGIREFLDRQ SAPLPTRRPL ITPEQEQI*LR 

1201 DQKELLPVGS PFFPGVDRIP KWQYAQAVIR DPDTGAAAHG DPIVAEKQII VPVERPRANQ 

1261 ALIYVLASEV NFNDIWAITG IPVSRFDEHO RDWHVTGSGG IGLIVALGEE ARREGRLKVG 

1321 DLVAIYSGQS DLLSPIMGLD PMAADFVTQG tfDTPDGSHQQ FMLAQAPQCL PIPTDMSIEA 

1381 AGSYILNLGT IYRALFTTLQ IKAGRTIFIE GAATGTGLOA ARSAARNGIA VIGMVSSSSR 

1441 ASTLLAAGAH GAINRKDPEV ADCFTRVPEO PSAWAAWEAA GQPLLAMFRA QNDGRIADYV 

1501 VSHAGETAFP RSFQLIiGEPR DGHIPTLTFY GATSGYHPTF LGKPGSASPT EMLRRANIAA 

1561 GEAVLIYYGV GSDDLVDTGG LEAIEAARQM GARIWVTVS DAQREFVLSL GFGAALRGW 

1621 SLAELKRRFG DEFEWPRTMP PLPNARQDPQ GLKEAVRRFN DLVFKPLGSA VGVFLRSADN 

1681 PRGYPDLI IE RAAHDALAVS AMLIKPFTGR IVYFEDIGGR RYSFFAPQIW VRQRRIYMPT 

1741 AQIFGTHLSN AYEILRLNDE ISAGLLTITE FAWPWDELP EAHQAMWENR HTAATYWNH 

1801 ALPRLGLKNR DELYEAWTAG ER (SEQ ID NO: 39)
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Figure 29

ATGAGTGAAGAGTCTCTGGTTCTOIGCACAATTCAAGGCCCCATCGC^ 

AATCGCCCCCAGGCCCTCAATGGGCTCAGTCCGGCCTTGATTGATGACCTCATTCGCCAT 

TTAGAAGCCTGCGATGCCGATGACACAATCCGCGTGATCATTATCACCGGCGGGGGACGG 

GCATTTGCTGCCGGCGCTGATATCAAAGCGATGGCCAATGCCACGCCTATTGATATGCTC 

AG<IAGTGGCATGATTGCGCGCTGGGCACGCATCGCCGCGGTGCGCAAACCGGTGATTGCT 

GC<X5TGAATGGGTATGCGCTCGGTGGTGGTTGTGAATTGGCAATGATGTGCGACATCATC 

ATCGCCAGTGAAAACGCGCAGTTCGGACAAC^GAAATCAATCTGGGCATCATTCCCOTT 

GCTGGTGGCACCCAACGGCTGAC<K2GCGCCCTTGGCCCGTATCGCGCAAT 

CT{SACCG<k&CGACCATCAGTGCT^ 

TGCCCGCCTGAAAGCCTGCTCGATGAAGCCCGTCGGATCGCGCAAACCATTGCCACCAAA 
TCACCACTGGCTGTACAGTTGGCGAAAGAGGCAGTCCGTATGGCCGCCGAAACCACTGTG 
CGCGAGGGGTTGGCTATCGAGCTGGGTAACTTCTATCTGCTGTTTGCCAGTG CTGAG CAA 
AAAGAGGGGATGCAGGCATTTATCGAGAAACGCGCTCCCAACTTCAGTGGTCGTTGA 
(SEQ ID NO: 40) 
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Figure 30 

MSEESLVLST I EGPI AILTLNRPQALNALSPMjI DDLIRHLEAGDADDT IRVI I ITGAGR 
AFAAG AD I KAMAN AT PI DMLT S GMI ARW ARI AAVRKPVI AAVNG Y ALGGGCELAMMCD I 1 
I ASENAQFGQPEINLGI I PG AGGTQRLTRALGP YRAMELILTGATIS AQEAIAHGLVCRV 
CPPESLLDEARRIAQTIATKSPIAVQLAKEAVRMAAETTV^ 
KEGMQAFIEKRAPNFSGR (SEQ ID NO: 41) 
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Figure 31 



GGCGTAATCCGACCGGCAGGTTACGGTCTTCTACTGGGGTCAAGGCGCGTCTCCTTTTGG 

TGGCGCGAGCAAG<XGGCTTTTCCTGGCTTCAATGTACCATAGAGCGGTTACTTCGTGCA 

ACGGGCGTGGTACAATCGAGAGCAACCTTTCGCAAAAGCTATCCAATCCTGCACACGTGC 

ATCTGTTACAGGGTATTATTGTCGGCAAACGACAGTCCTGTCGTTTATGTACAAGGAGAT 

CAACGTATGAGTGAAGAGTCTCTGGTTCTCAGCACAATTGAAGGCCCCATCGCCATCCTC 

ACCCTCAATCGCCCCCAGGCCCTCAATGCGCTCAGTGCGGCCTTGATTGATGACCTCATT 

CGCCATTTAGAAGCCTGCGATGGCGATGACACAATCXGCGTGATCATTATCACCGGCGCC 

GGACGGGCATTTGCTGCCGGCGCTGATATCAAAGCGATGGCCAATGCCACGCCTATTGAT 

ATGCTCACCAGTGGCATGATTGCGCGCTGGGCACGCATGGGCGCGGTGCGCAAACCGGTG 

ATTGCTGCCGTGAATGGGTATGCGCTCGGTGGTGGTTGTGAATTGGCAATGATGTGCGAC 

ATCATCATCGCCAGTGAAAA<X3CGCAGTTCGGACAACCGGAAATCAATCTGGGCATCATT 

CCCGGTCCTGGTGGCACCCAACGGCTGACCCGCGCCCTTGGCCCGTATCGCGCAATGGAA 

TTGATCCTGACCGGCGCGACCATCAGTGCTCAGGAAGCTCTCGCCCACGGCCTGGTGTGC 

CGGGTCTGCCCGCCTGAAAGCCTGCTCGATGAAGCCCGTCGGATCGCGCAAACCATTGCC 

ACCAAATCACCACTGGCTGTACAGTTGGCGAAAGAGGCAGTCCGTATGGCCGCCGAAACC 

ACTGTGCGCGAGGGGTTGGCTATCGAGCTGCGTAACTTCTATCTGCTGTTTGGCAGTGCT 

GACCAAAAAt^GGGGATGCAtSGCATTTATC^^ 

TACTACXAGATGATCGAGCAGTAAAGGGTAAATftCTCTATCAATCTGGCCAGATAAQCGG 
TTGGGTAACAACGCAATGCTGCAAAGGAGACGATCATGGACATACACGAGCGATTGCGAT 
CTCTCGAACGCGAiyffiTGCT (SEQ ID NO: 42)
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Figure 32 

-atgagtga — agagt— — 

— atgacgta — cgaaa- 



atggccgccctgcgtgt— cctgctgtcctgcgcccgcggcc 

atggcggccctgcgtgctctgctgcccagagc — — — 

tctc-agcacaattgaa 

-ctggtcgagcgc gat 

— — gttcgc-tgtcccgcctgg 
gcaactcgctgttgtccccagttcgc-tgcccagaattc 

37 ggccccatcgcc— — ~ — — — -atcctcacc— — — 
3 4 cagcgagttggc-- - ■ — — -« — attatcacg-- — — — 




123 agggaagaataacaccgtggggttgatccaac- 



-gaaaagaaaggaaagaata 



-tcaatcgcccccaggccctcaatgcgctc 
-tgaaccgtccccaggcactgaacgcgctc 



tgaaccgccccaaggccctcaatgcactt 

134 gcagcgtggggctgatccagttgaaccgtcccaaagcactcaatgcactt 

agtccggccttgattgatgacctcattc — gccatttagaagcctgcgat 
a — acagccagg — tgatgaacgaggtc — acca — gcgctgcaaccgaa 



-atccgcgtgatcattatcaccggcgccggacg 



-tgcc 
-cctg 
bttcc 
-ccgg 



acgttcgccgacgcgttcaccgccgacttcttcgccacc tggggcaa 

aggactgtt— actccagcaagttcttgaagcac — -tggggcca 



gctggccgccgtgcgcaccccgacgatcgccgcggtggcgggatacgcgc 
cctcacccaggtcaagaagccagtcatcgctgctgtcaatggctatccgt 
tatcacccggatcaagaaaccggtcatcgcggctgtcaatggctatgctc 



tcggcggtggctgcgagctggcgatgatgtgcgacgtgctgatcgccgcc
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SEQ ID NO: 40 370 gaaaacgcgcagttcggacaaccggaaatcaatctgggcatcattcccgg

SEQ ID NO: 43 367 gacaccgcgaagttcggacagcccgagataaagctgggcgtgctgccagg 

SEQ ID NO: 44 466 gagaaggcccagtttgcacagccggagatcttaataggaaccatcccagg 

SEQ ID NO: 45 466 gagaaagcccagtttggacagccagaaatcctcctggggaccatcccagg 

SEQ ID NO: 40 420 tgctggtggcacccaacggctgacccgcgcccttggcccgtatcgcgcaa 

SEQ ID NO: 43 417 catgggcggctcccagcggctgacccgggctatcggcaaggctaaggcga

SEQ ID NO: 44 516 tgcaggcggcacccagagactcacccgtgctgttgggaagtcgctggagc

SEQ ID NO: 45 516 tgcagggggcactcagagactcacccgagcagtcggcaaatcactagcaa 

SEQ ID NO: 40 470 tggaattgatcctgaccggcgcgaccatcagtgctcaggaagctctcgcc 

SEQ ID NO: 43 467 tggacctcatcctgaccgggcgcaccatggacgccgccgaggc-cgagcg 

SEQ ID NO: 44 566 tggagatggtcctcaccggtgacgcgatctcagcccaggacgc-caagca

SEQ ID NO: 45 566 tggagatggtcctcactggtgaccgaatttcagcacaggatgc-caagca

SEQ ID NO: 40 520 ca-c-ggcctggtgtgccgggtctgcccgcctgaaagcctgctcgatgaa 

SEQ ID NO: 43 516 cagc-ggtctggtttcacgggtggtgccggccgacgacttgctgaccgaa 

SEQ ID NO: 44 615 ag-caggtcttgtcagcaagatttgtcctgttgagacactggtggaagaa 

SEQ ID NO: 45 615 ag-caggtcttgtaagcaagatttttcccgttgaaacactggttgaagag 

SEQ ID NO: 40 568 gcccgtcggatcgcgcaaaccattgccaccaaatcaccactggctgtaca 

SEQ ID NO: 43 565 gccagggccactgccacgaccatttcgcagatgtcggcctcggcggcccg 

SEQ ID NO: 44 664 gccatccagtgtgcagaaaaaattgccagcaattctaaaattgtagtagc 

SEQ ID NO: 45 664 gccatccaatgtgcagaaaagatcgccaacaattccaagatcatagtagc 

SEQ ID NO: 40 618 gttggcgaaagaggcagtccgtatggccgccgaaaccactgtgcgcgagg 

SEQ ID NO: 43 615 gatggccaaggaggccgtcaaccgggctttcgaatccagtttgtccgagg 

SEQ ID NO: 44 714 gatggccaaagaatcagtgaatgcagcttttgaaatgacattaacagaag 

SEQ ID NO: 45 714 catggcgaaagaatctgtgaatgcagcctttgaaatgacgttaacagaag 

SEQ ID NO: 40 668 ggttggctatcgagctgcgtaacttctatctgctgtttgccagtgctgac 

SEQ ID NO: 43 665 ggctgctctacgaacgccggcttttccattcggctttcgcgaccgaagac 

SEQ ID NO: 44 764 gaagtaagttggagaagaaactcttttattcaacctttgccactgatgac 

SEQ ID NO: 45 764 gaaataagctggagaagaagctcttctattccacctttgccactgatgac 

SEQ ID NO: 40 718 caaaaagaggggatgcaggcatttatcgagaaacgcgctcccaacttcag 

SEQ ID NO: 43 715 caatccgaaggtatggcagcgttcatcgagaaacgcgctccccagttcac 

SEQ ID NO: 44 814 cggaaagaagggatgaccgcgtttgtggaaaagagaaaggccaacttcaa 

SEQ ID NO: 45 814 cggagagaagggatgtctgcctttgtggagaaaaggaaggccaacttcaa 

SEQ ID NO: 40 768 tggtcgttga

SEQ ID NO: 43 765 ccaccgatga 

SEQ ID NO:44 864 agaccagtga 

SEQ ID NO: 45 864 agaccactga 
Figure 33
1 -maeealv latiegp

1 -mtyetil ver-dqr- 

1 -maalrvl- lacargplxppvrcpawrpfaaganfeyilaekrg 

1 maalrallpracnaUspvrcpefrrfasganfqyiitekkgknss 

15 iailtlnrpqalnalspaliddlirhleacdaddtirvititgagr

14 vgiitlnxpqalnalns qvmnevt saa teldddpdigaiii t gsak

43 knntvgliqlnrpkalnalcdglidelnqalkifeedpavgaivltggdk 
47 vgliqlnrpkalnalcnglieelnqaletfeedpavgaivltggek 

61 afaagadikamanatpidmltsgmiarwariaavrlqyviaavngyalggg 

60 afaagadikemadltfadaftadffatwgklaavrtptiaavagyalggg 

93 afaagadikemqnlsfqdcyeskflkhwdhltqvkkpviaavngyafggg 

93 afaagadikemqnrtfqdcysgkflshwdhitrikkpviaavngyalggg 

111 celamcdliiaaenaqfgqpelnlgiipgaggtqrltralgpyraineli 
110 celanmcdvliaadtakfgqpeiklgvlpgmggsqrltraigkakaittdli 
143 celannncdiiyagekaqfaqpeiligtipgaggtqrltxavgkalamemv 
143 celammcdiiyagekaqfgqpeillgtipgaggtqrltravgk8lameniv 

161 ltgatiaaqealahglvcrvcppeslldearriaqtlatkaplavqlake m 
160 ltgrtandaaeaersglvsrvvpaddlltearatattisqmsaaaarxnake 
193 ltgdriaaqdakqaglvaltl^vetlveealqcaekiaanakivvaxiiako 
193 ltgdrisaqdak^glvskifpvetlveeatqcaekiannskilvamake 

211 avrmaaettvreglaielrnfyllfasadqkegmqafiekrapnfsgr 
210 avnrafeaalaegllyerrlfhaafatedqeegmaafiekrapqfthr 
243 svnaafemtltegsklekklfyatfatddrkegmtafvekrkanfkdq 
243 svnaafemtltegnUekklfystfatddrregmaafvekrkanfkdh 
Figure 39

ATGATCGACACTGCGCCCCTTGCCCCACCACGGGCGCCCCGCTCTAATCCGATTCGGGAT
CGAGTTGATTGGGAAGCTCAGCGCGCTGCTGCGCTGGCAGATCCCGGTGCCTTTCATGGC 
GCGATTGCCCGGACAGTTATCCACTGGTACGACCCACAACACCATTGCTGGATTCGCTTC
AACGAGTCTAGTCAGCGTTGGGAAGGGCTGGATGCCGCTAGCGGTGCCCCTGTAACGGTA 
GACTATCCCGCCGATTATCAGCCCTGGCAACAGGCGTTTGATGATAGTGAAGCGCCGTTT
TACCGCTGGTTTAGTGGTGGGTTGACAAATGCCTGCTTTAATGAAGTAGACCGGCATGTC 
ATGATGGGCTATGGCGACGAGGTGGCCTACTACTTTGAAGGTGACCGCTGGGATAACTGG 
CTCAACAATGGTGGTGGTGGTCCGGTTGTCCAGGAGACAATCACGCGGCGGCGCCTGTTG 
GTGGAGGTGGTGAAGGCTGCGCAGGTGTTGCGTGATCTGGGCCTGAAGAAGGGTGATCGG
ATTGCTCTGAATATGCCGAATATTATGCCGCAGATTTATTATACGGAAGCGGCAAAACGA 
CTGGGTATTCTGTACACGCCGGTCTTCGGTGGCTTCTCGGACAAGACTCTTTCCGACCGT
ATTCACAATGCCGGTGCACGAGTGGTGATTACCTCTGATGGTGCGTACCGCAACGCGCAG 
GTGGTGCCCTACAAAGAAGCGTATACCGATCAGGCGCTCGATAAGTATATTCCGGTTGAG 
ACGGCGCAGGCGATTGTTGCGCAGACCCTGGGCACCTTGCCCCTGACTGAGTCGCAGCGC 
CAGACGATCATCACCGAAGTGGAGGCCGCACTGGCCGGTGAGATTACGGTTGAGCGCTCG 
GACGTGATGCGTGGGGTTGGTTCTGCCCTCGCAAAGCTCCGCGATCTTGATGCAAGCGTG 
CAGGCAAAGGTGCGTACAGTACTGGCGCAGGCGCTGGTCGAGTCGCCGCCGCGGGTTGAA 
GCTGTGGTGGTTGTGCGTCATACCGGTCAGGAGATTTTGTGGAACGAGGGGCGAGATCGC 
TGGAGTCACGACTTGCTGGATGCTGCGCTGGCGAAGATTCTGGCCAATGCGCGTGCTGCC 
GGCTTTGATGTGCACAGTGAGAATGATCTGCTCAATCTCCCCGATGACCAGCTTATCCGT 
GCGCTCTACGCCAGTATTCCCTGTGAACCGGTTGATGCTGAATATCCGATGTTTATCATT 
TACAGATCGGGTAGCACCGGTAAGCCCAAGGGTGTGATCCACGTTCACGGCGGTTATGTC 
GCCGGTGTGGTGCACACCTTGCGGGTCAGTTTTGACGCCGAGCCGGGTGATACGATATAT 
GTGATCGCCGATCCGGGCTGGATCACCGGTCAGAGCTATATGCTCACAGCCACAATGGCC 
GGTCGGCTGACCGGGGTGATTGCCGAGGGATCACCGCTCTTCCCCTCAGCCGGGCGTTAT 
GCCAGCATCATCGAGCGCTATGGGGTGCAGATCTTTAAGGCGGGTGTGACCTTCCTCAAG 
ACAGTGATGiTCC^TCCGCAGAATGTTGAAGATGTGCGA 

cgggttgcaaccttctgcgccgagccggtcagtccggcggtgcagcagtttggtatgcag 
atcatgaccccgcagtatatcaattcgtactgggcgaccgagcacggtggaattgtctgg 

ccctgggtgatgggtgatgtctgggtggccgaaactgatgagagcgggacgacgcgctat 
cgggtcgctgatttcgatgagaagggcgagattgtgattaccgccccgtatccctacctg 
acccgcacactctggggtgatgtggccggtttcgaggcgtacctgcgcggtgagattccg 
ctgcgggcctggaagggtgatgccgagcgtttcgtcaagacctactggcgacgtgggcca 
aacggtgaatggggctatatccagggtgattttgccatcaagtaccccgatggtagcttc 
acgctccacggacgccctgacgatgtgatcaatgtgtcgggccacggtatgggcaccgag 
gagattgagggtgccattttgcgtgaccgccagatcacgcccgactcgcccgtcgctaat 
tgtattgt<^tcggtgcxk:<^k^ccgtgagaagggtctgaccccggttgccttcattcaa 
cctgcgcctggccgtcatctgaccggcgccgaccggcgccgtctcgatgagctggtgcgt 
accgagaagggggcggtcagtgtcccagaggattacatcgaggtcagtgcctttcccgaa 
acccgcagcgggaagtatatgcggggctttttgcgcaatatgatgctcgatgaaccactg 
ggtgatacgacgacgttgcgcaatcctgaagtgctcgaagagattgcagccaagatcgct 
gagtggaaacgccgtcagcgtatggccgaagagcagcagatcatcgaacgctatcgctac 
ttccggatcgagtatcacccaccaacggccagtgcgggtaaactcgcggtagtgacggtg 
acaaatccgccggtgaacgcactgaatgagcgtgcgctcgatgagttgaacacaattgtt 

GACCACCTGGCCCGTCGTCAGGATGTTGCCGCAATTGTCTTCACCGGACAGGGCGCCAGG 
AGTTTTGTCGCCGGCGCTGATATTCGCCAGTTGCTCGMGAGATTCATACGGTTGAAGAG 
GCAATGGCCCTGCCG AAT AACGCCCAT CTTGCTTTCCGC AAGATT GAGCGT AT G AAT AAG 
CCGTGTATCGCGGCGATCAACGGTGTGGCGCTGGGTGGTGGTCTGGAATTCGCCATGGCC

TGCCATTACCGGGTTGCCGATGTCTATGCCGAATTCGGTCAGCCAGAGATTAATCTGGGC 

TTGCTACCTGGTTATGGTGGCACGCAGCGCTTGGCGCGCCTGTTGTACAAGCGCAACAAC

GGCAGCGGTCTGCTCCGAGCGCTGGAGATGATTCTGGGTGGGCGTAGCGTACCGGCTGAT 

GAGGCGCTGAAGCTGGGTCTGATCGATGCCATTGCTACCGGCGATCAGGACTCACTGTCG 

CTGGCATGOSCGTTAGCCCGTGCCGCAATCGGCGCCGATGGTCAGTTGATCGAGTCGGCT 

GCGGTGACCCAGGCTTTCCGCCATCGCCACQAGCAGCTTGACGAGTGGCGCAAACCAGAC 

CCGCGCTTTGCCGATGACGAACTGCGCTCGATTATGGCCCATCCACGTATCGAGCGGATT 

ATCCGGCAGGCCCATACCGTTGGGCGCGM 

CGCTATGGCATTATCCACGGCTTCGAGGCCGGTCTGGAGCAG<3AGGCGAAGCTCTTTGCC 

GAisGCAGTGGTTGACCCGAACGGTGGCAAGCGTGGTATTCGCGAGTTCCTCGACCGCCAG 

AGTGCGCCGTTGCCAACCCGCCGACCATTGATTACACCTGAACAGGAGCAACTCTTGCGC

GATCAGAAAGAACTGTTGCCGGTTGGTTCACCCTTCTTCCCCGGTGTTGACCGGATTCCG 

AAGTGGCAGTACGCGCAGGCGGTTATTCGTGATCCGGACACCGGTGCGGCGGCTCACGGC 

GATCCCATCGTGGCTGAAAAGCAGATTATTGTGCCGGTGGAACGCCCCG<3CGCCAATCAG 

GCGCTGATCTATGTTCTGGCCTCGGAGGTGAACTTCAACGATATCTGGGCGATTACCGGT 

ATTCCGGTGTCACGGTTTGATGAGCAC<^CCGCGACTGGCACGTTACCGGTTCAGGT<3GC 

ATCGGCCTGATCGTTGCGCTGGGTGAAGAGGGGCGACGCGAAGGCCGGCTiSAAGGTGGGT 

GATCTGGTGGCGATCTACTCCGGGCAGTCGGATCTGCTCTCACCGCTGATGGGCCTTGAT 

CCGATGGCCGCCGATTTCGTCATCCAGG(^AACGA<^^ 

TTTATGCTGGCCCAGGCCCCGCAGTGTCTGCCCftTCCCAACCGATATGTCTATCGAGGCA 
GCCGGCAGCTACATCCTCAATCTCGGTACGATCTATCGCGCCCTCTTTACGACGTTGCAA 
ATC^GGCCGGACGCACC^TCTTTATCGAGGGl^ 

GCGCGCTCGGCGGCCCGGAATGGTCTGCGCGTAATTGGAATGGTCAGTTCGTCGTCACGT 
GCCTCTACGCTGCTGGCTGCGGGTGCCCACGGTGCGATTAACCGTAAAGACCCGGAGGTT 
GCCGATTGTTTCACGCGCGTGCCCGAAGATCCATCAGCCTGGGCAGCCTGGGAAGCCGCC 
GGTCAGCCGTTGCTGGCGATGTTCCGGGCGCAGAACGACGGGCGACTGGCCGATTATGTG 
GTCTCGCACGCGGGCGAGACGGCCTTCCCGCGCAGTTTCCAGCTTCTCGGCGAGCCACGC 
^ATGGTCACATTCCGACGCTCACATTCTACGGTGCCACCAGTGGCTACCACTTCACCTTC 
CTGGGTAAGCCAGGGTCAGCTTCGCCGACCGAGATGCTGCGGCGGGCCAATCTCCGCGCC 
GGTGAGGCGGTGTTGATCTACTACGGGGTTGGGAGCGATGACCTGGTAGATACCGGCGGT 
CTGGAGGCTATCGAGGCGGCGCGGCAAATGGGAGCGCGGATCGTCGTCGTTACCGTCAGC 
<5ATGCGCAACGCGAGTTTGTCCTCTCGTTGGGCTTCGGGGCTGCCCTACGTGGTCTCCTC 
AGCCTGGCGGAACTCAAACGGCGCTTCGGCGATGAGTTTGAGTGGCCGCGCACGATGCCG 
CCGTTGCCGAACGCCCGCCAGGACCCGCAGGGTCTGAAAGAGGCTGTCCGCCGCTTCAAC 
GATCTGGTCTTCAAGCCGCTAGGAAGCGCGGTCGGTGTCTTCTTGCGGAGTGCCGACAAT 
CCGCGTGGCTACCGCGATCTGATCATCGAGCGGGCTGCCCACGATGCACTGGCGGTGAGC 
GCGATGCTGATCAAGCCCTTCACCGGACGGATTGTCTACTTCGAGGACATTGGTGGGCGG 
CGTTACTCCTTCTTCGCACCGCAAATCTGGGTGCGCCAGCGCCGCATCTACATGCCGACG 

ATCAGCGCCGGTCTGCTGACGATTACCGAGCCGGCAGTGGTGCCGTGGGATGAACTACCC 
GAAGCACATCAGGCGATGTGGGAAAATCGCCACACGGCGGCCACTTATGTGGTGAATCAT 
GCCTTACCACGTCTCGGCCTAAAGAACAGGGACGAGCTGTACGA^ 
GAGCGGTAG (SEQ ID NO: 129) 
Figure 40



SEQ ID NO: 39 1 midtaplappraprsnpirdrvdwe 

SEQ ID NO: 130 1 mglpeervrsgsgsrgqeeagaggrarswsp — ppevsrsahvpslqryr 

SEQ ID NO: 131 1 mslelkekeselpfdeqiind 

PL PP RS P 



SEQ ID NO: 39 26 aqraaaladpgafhgaiartvihwydpqhhcwirfnessqrwegldaatg 
SEQ ID NO: 130 49 elhrrsveeprefwgdiake-fywktpcpgpflryn- 

SEQ ID NO: 131 22 kwrs kytpidayf kfhrqtvenleaf— 

R PFGIATIWYPH R NES WE 



SEQ ID NO:39 76 apvtvdypadyqpwqqafddseap-fyrw£sggltnacfnevdrhvm-ing 

SEQ ID NO: 130 84 fdvtkgkif iewmkgattnicynvldrnvhekk

SEQ ID NO: 131 52 -akelew f kpwdkvldasnpp-fykwfvggrlnlaylavdrhv*-tw 

P« FDS PFYWF66TNCN VDRHV 



SEQ ID NO:39 
SEQ ID NO: 130 
SEQ ID NO: 131 



SEQ ID NO:39 
SEQ ID NO: 130 
SEQ ID NO: 131



124 ygdevayy f egdrwdns lnngrggpwqetitrrrllvewkaaqy^r-d 

117 lgdkva f ywegne — pgettqityhqllvqvcqfsnvlr-k 

96 rknklaiewegepvden gyp tdr r kltyydlyrevnrvayml kqn 

GD VA Y EG D G P IT LLVEV A VLR 

173* lglkkgdrlalninpniiiq>qiyyte-aaltrlgilytpvfggfed3ctlsdri 
155 qgiqkgdrvaiyn^mipelwaml-acarigalhaivfagfsaeslceri 
141 f gvkkgdkitlylp-mvpeipitmlaawrigaitsvvfsgf oadalaerl 
G KKGDRIAL MP I P T AA ft G L VF GFS X.RX 



SEQ ID NO: 39 222 hnagarvvitsdgayrnaqvvpykeaytdqal— — dJcyipvetaqaiva. 
SEQ ID NO: 130 204 ldsscsllittdafyrgeklvnlkel-adealqkcqekgfpvrc — ciw 

SEQ ID NO: 131 190 ndsqsrivltadgfwrrgrwrlkev—— ' — 

R VIT DG YR W- KE DAL K PV IV 



SEQ ID NO: 39 266 qtlatlpltesgrqtiiteveaalageitversdvmrgvgsalakirdld 

SEQ ID NO: 130 251 khlgrael : -gmgdats « 

SEQ ID NO: 131 216 : vdaal 



V AAL 



G G 



SEQ ID NO: 39 318 asvqakvrtvlaqalvespprveav v v vrk tg-qeilwnegrdrwshdll 

SEQ ID NO: 130 266 — qsppikrscpdv— — qiswnqgidlwwhelm 

SEQ ID NO: 131 221 ekatgvesvlvlprlglkdvpmtegrdywwnkln 

ESPP VE V W G I WNEGRD W H L 



SEQ ID NO:39 
SEQ ID NO: 130 
SEQ ID NO: 131 



367 daalakilanaraagfdvhsendllnlpddqllralyaslpcep — vdae 

255 g-^ — — — —-gtppn— — ayiepep— vese 

A P D A I CEP VDAE 



SEQ ID NO:39 
SEQ ID NO: 130 
SEQ ID NO: 131 



SEQ ID NO: 39 
SEQ ID NO:130 
SEQ ID NO:131 



415 ypmfiiytsgstgkpkgvihvhggyvagvvhtlrvsfdaepgdtiyvtad 
309 c^>lfilytsgstgkpkgwhtvggymlyvattflcyv£dfhaedv£«fctad 
272 hpsfAlytsgttg3^>kglvhdtggwa\iivyatmtafvfd irddd tfWctad 
P FI YTSGSTGKPKGV H GGY V T FD . D AD 

4 65 pgwi tgqsyml ta tmagrl tgviaegsplfpsagryasiierygvqif ka 
359 igwitghsyvtygplangatsvlfeglptypdviurlwsivdkykvtkfyt 
322 igwvtghsywlgpllmgateviyegapdypqpdrww8iierygvtifyt 
GWITG SY A T VI EG P P R SIIERYGV IF 
SEQ ID NO: 39 515 gvtflktvmsnpqnvedvrlydmhslrvatfcaepvspavqqfgmqimtp
SEQ ID NO: 130 409 aptairllmkfgd — epvtkhsraslqvlgtvgepinpeavlwyhrwga 
SEQ ID NO: 131 372 sptairmfmryge — ewprkhdlstlriihsvgeptnpeawrvayrvlgn 
T M B*VR D SIAV EP P 

SEQ ID NO: 39 565 q yi nsywatehggiwthfygnqdfplrpdahtyplpwvmgdvw 

SEQ ID NO: 130 457 qrcpiv dtfwqtetgghaltplpgat — pmkpgaatfp ffgva 

SEQ ID NO: 131 420 e kvafgstwwmtetggivishapglylvpmkpgtngpplpgfevdv- 

Q « TE GGIV TH G P P T PI*P DV 

SEQ ID NO: 39 609 vaetdeagttryrvadfdekgeivitapypyltrtlvgdvpgf eaylrge 

SEQ ID NO: 130 498 pailnesg — --eelegeaegylvf kqpwpgiartvy— — — — 

SEQ' ID NO: 131 466 vdengnp appgv kgy 1 vi kkpwpgmlhgiw — 

A DESG A KG VI P P RT V 

SEQ ID NO: 39 659 iplrawkgdaerfvktywrrgpngewgyiqgdfaikypdgsftlhgrpdd

SEQ ID NO: 130 531 — — gnherfettyf kkfpg yyvtgdgcqrdqdgyywitgridd 

SEQ ID NO: 131 496 gdperyiktywsrfpg mfyagdyaikdkdgyiwvlgrede 

GD ERF KTTO R P Y GD AIK DG GR DO 



SEQ ID NO: 39 
SEQ ID NO: 130 
SEQ ID NO: 131 



SEQ ID NO:39 
SEQ ID NO: 130
SEQ ID NO: 131 



709 vinvsghrmgteeiegailrdrqitpdspvgnctwgaphrekgltpvaf 

57 1 mlnvsghll 8 taevesalve hea vaeaawghphpvkgeclycf 

53 6 vi kvaghr lg tyelesali shpavaesa wgvpdai kgevpiaf 

* VINVSGHR GTEEA V WGPHKGPAP 

759 1 qpapgrhl t gadr r r ldelvrtekgavs vpedy ie- vsafpe tr agkyia 
615 vtlcdghtfspklteelkkqirekigpiatp-dyiqnapglpktrsgklm 
580 wlkqgvapsdelrkelrehvrrtigpiaepaqif f-vtklpktrsgklm 
G RLEVRG PDWVP. TRSGK M 



SEQ ID NO: 39 808 rrflrnniml-deplgdtttlrnpevleeiaakiaewkrrqrniaeeqqiie 

SEQ ID NO: 130 664 rrvlrkiaqndhdlgdmstvadpsvi . 

SEQ ID NO: 131 629 rrllkavat-gaplgdvtt 

RR LR D PLGD TT P V 

SEQ ID NO: 39 857 ryryfrieyhppt^sagklavvtvtnppvrialneraldelntivdhlarr 

SEQ ID NO: 130 690

SEQ ID NO: 131 647 t m : 



SEQ ID NO: 39 907 qdvaaivftgqgarsfvagadlrqlleethtveeamalpnnahlaf rkie 

SEQ ID NO: 130 690 shl 

SEQ ID NO: 131 647 ledeteveeak- 



I£ 



VEEA 



HL 



SEQ ID NO: 39 957 rmnkpciaaingvalggglefamachyrvadvyaefgqpeinlrllpgyg 

SEQ ID NO: 130 693 

SEQ ID NO: 131 658 - 



SEQ ID NO:39 
SEQ ID NO: 130 
SEQ ID NO: 131 



1007 
693 
658 



gtqrlprllykrnngtgllralemilggrsvpadealklglidaiatgdq 

raye ■ — ■ - 

RA E 



SEQ ID NO: 39 1057 dslslacalaraaigadgqliesaavtqafrhrheqldewrkpdprfadd 

SEQ ID NO: 130 693 ffl 

SEQ ID NO: 131 662 

F BR 
SEQ ID NO: 39 1107 elrsiiahprieriirqahtvgrdaavhraldairygiihgfeaglehea

SEQ ID NO: 130 691 ] 

SEQ ID NO: 131 662 

SEQ ID NO: 39 1157 klfaeawdpnggkrgtrefldrqsaplptrrplltpeqeqllrdqkell 

SEQ ID NO: 130 697 

SEQ ID NO: 131 662 " ~ 

SEQ ID NO: 39 1207 pvgspffpgvdripkwqyaqavirdpdtgaaahgdpivaekqiivpverp 

SEQ ID NO: 130 697 ! 

SEQ ID NO: 131 662 : ~ ! 

SEQ ID NO: 39 1257 ranqaliyvlasevnfndiwaltgipvsrfdehdrdwhvtgsggigliva 

SEQ ID NO: 130 697 

SEQ ID NO: 131 662 — — 

SEQ ID NO: 39 1307 lgeearregrlkvgdlvaiysgqsdllspljagldpmaadfviqgndtpdg 

SEQ ID NO: 130 697 " 

SEQ ID NO: 131 662

SEQ ID NO: 39 1357 shqqfittlaqapqclptptdmsieaagayllnlgtlyralfttlqiltagrt 

SEQ ID NO: 130 697 ; cl rrr-i :-tiq 

SEQ ID ,NO:131 . 662 -rrr-^^ 

CL T QIKA 

SEQ ID NO: 39 1407 ifiegaatgtgldaaraaarnglrvlgmvssssrastllaagahgainrk 

SEQ ID NO: 130 702 ~ 

SEQ ID NO: 131 666 - : — : * 

SEQ ID NO: 39 1457 dpevadcftrvpe^aawaaweaagqpllaxnfraqndgrladywshage 

SEQ ID NO: 130 702 T ' 

SEQ ID NO: 131 666 -— : " 

SEQ ID NO: 39 1507 tafprsfqllgeprdghiptltfygatsgyhftflgkpgsasptemlrra

SEQ ID NO: 130 702 L " " " " " 

SEQ ID NO: 131 666 ; 

SEQ ID NO: 39 1557 nlrageavliyygvgsddlvdtgglealeaarqmgarivwtV8daqref 

SEQ ID NO: 130 702 

SEQ ID NO: 131 666

SEQ ID NO: 39 1607 vlslgfgaalrgwslaellcrrfgdefeHprtinpplpnarqc^qgllceav 

SEQ ID NO: 130 702 — . 

SEQ ID NO: 131 666 e**** " 

E RT 

SEQ ID NO: 39 1657 rrfndlvfl^lgsavgvflrsadnprgyp<ai ieraahd alav6ainli^ 

SEQ ID NO: 130 702 

SEQ ID NO:131 671 : : 
SEQ ID NO: 39 1707 ftgrivyfediggrrysffapqiwvrqrriymptaqifgthlsnayeilr

SEQ ID NO: 130 702

SEQ ID NO: 131 671

SEQ ID NO: 39 1757 lndeisaglltitepawpwdelpeahqamwenrhtaatywnhalprlg 

SEQ ID NO: 130 702 

SEQ ID NO: 131 671

SEQ ID NO: 39 1807 lknrdelyeawtager

SEQ ID NO: 130 702

SEQ ID NO: 131 671
Figure 41

SEQ ID NO: 39 1 midtaplappraprsnpirdrvdweaqraaaladpgafhgaiartvihwy

SEQ ID NO: 132 1

SEQ ID NO: 133 1 

SEQ ID NO: 39 51 dpqhhcwirfnessqrv#egldaatgapvtvdypadyqpwqqafddaeapf 

SEQ ID NO: 132 1 

SEQ ID NO: 133 1 «nd 



SEQ ID NO: 39 101 yrwfsggltnacfnevdrhvmmgygdevayyfegdrwdnslnngrggpvv 

SEQ ID NO: 132 1 — — melon 

SEQ ID NO:133 3 fnnv~— - 



FN V 



LNN 



SEQ ID NO: 39 151 qetltrrrllvevvkaaqvlrdlglkkgdrialnmpnlmpqiyyteaakr 

SEQ ID NO:132 6 

SEQ ID NO: 133 7 llnkddgial 

LKDIAL 



SEQ ID NO:39 201 lgilytpvfggf sdktlsdrihnagarwltsdgaymaqwpykeaytd 

SEQ ID NO: 132 6 • 

SEQ ID NO:133 17 



SEQ ID NO: 39 251 qaWkyj^etaxjalvaqtiatipltesqrqtliteveaalageitvera 

SEQ ID NO: 132 6 vileke 

SEQ ID NO:133 17 - 

I E E 

SEQ ID NO:39 301 dvnrgvgsalaklrdldasvqalcvrtvlaqalvespprveawwrhtgq 

SEQ ID NO: 132 12 

SEQ ID NO: 133 17 



SEQ ID NO: 39 
SEQ ID NO: 132
SEQ ID NO: 133 



SEQ ID NO:39 
SEQ ID NO: 132 
SEQ ID NO: 133 



351 eilwnegrdrwahdlldaalakilanaraagfdvhsencUliapddqlir 

12 ■ 

17 iiin 

I N 

4 01 alyasipcepvdaeypjaf iiyt sgstgkpkgvlhvhggyvagwhfclrvs 

12 

21 ~ ~™ 



SEQ ID NO: 39 451 fdaepgdtiyviadpgwitgqByml'tatmagxltgviaegsplfpsagry 

SEQ ID NO: 132 12 ■ 

SEQ ID NO: 133 21 



SEQ ID NO: 39 501 asiierygvqifkagvtf Iktvmsnpqnvedvrlydmhalrvatf caepv 

SEQ ID NO: 132 12 — ~~ 

SEQ ID NO: 133 21 ' 
SEQ ID NO: 39
SEQ ID NO: 132
SEQ ID NO: 133



551 spavqqf graqimtpqy insyva tehggi vw thf y gnqdUBplrpdahtypl 

12 

2 1 • — . — -rpka — 

RP A 



SEQ ID NO: 39 601 pwvmgdvwvaetdesgttryrvadfdekgeivitapypyltrtlwgdvpg

SEQ ID NO: 132 12

SEQ ID NO: 133 25 ; - 



SEQ ID NO: 39 651 feaylrgeiplrawkgdaerfvktywrrgpngewgyiqgdfaikypdgsf 

SEQ ID NO: 132 12 

SEQ ID NO: 133 25 ; 

SEQ ID NO: 39 701 tlhgrpddvinveghrmgteeiegailrdrqitpdspvgnciwgaphre 

SEQ ID NO: 132 12 

SEQ ID NO: 133 25 



SEQ ID NO:39 
SEQ ID NO: 132 
SEQ ID NO: 133 



751 
12 
25. 



kgltpvafiqpapgrhltgadrrrldelvrtekgavsvpedyievsafpe 



SEQ ID NO:39 
SEQ ID NO: 132
SEQ ID NO: 133 



801 
12 
25 



tragkymrrflrxnnnacieplgdtttlxTspevleetaakiaewkrrqrmae 



SEQ ID NO: 39 851 eqqiieryryfrieyhpptasagklavvtvtnpp-vnalneraldelntt

SEQ ID NO: 132 12 — gkva wtinxpkalnalnsdt 1 kemdyv 

SEQ ID NO: 133 25 lnalnyetlkeldsv 

GK AWT P NALN L EL 



SEQ ID NO: 39 
SEQ ID NO: 132 
SEQ ID NO: 133 



SEQ ID NO: 39 
SEQ ID NO: 132 
SEQ ID NO: 133 



SEQ ID NO: 39 
SEQ ID NO: 132 
SEQ ID NO: 133 



SEQ ID NO: 39 
SEQ ID NO: 132 
SEQ ID NO: 133 



900 vdhlarrqdvaalvf tgqgar s f vagadirqlleeihtve-eamalpnaa 
4 0 igeiendsevlaviltgageksf vagadisem-kenmtiegrkf gilgnk 
40 ldivendkeikvliitgsgektfvagadiaemsn — mtpl-eakkf slyg 
D . V A TOG SFVAGADI E T E EA N 



94 9 hlaf r kiemmkpciaaingvalggglef amachyrvadvyaef gqpeln 
8 9 — vf r r lelle kpviaavngf algggceiams cdir ias enar f gqpevg 
87 qkvfrkleml6kpviaavngfalgggcelsmacdlria8knakfgqpevg 
FRKIE KP IAA NG ALGGG E AMAC R A . A 



999 lrllpgyggtqrlprllykmngtgllralemllggrsvpadealklgli 

137 lgitpgfggtqrlsrlv gmgmakqliftaqnikadealriglv 

137 Igiipgf sgtqrlprll gtskakellftgdmlnsdeaykigli 

L PG GGTQRLPRL G A E I G ADEALK GLI 

104 9 daiatgdqda Is lacalaraaigadgqliesaavtqaf rhrheqldewrk 

180 n 

180 slew— — — 



SEQ ID NO: 39 1099 p^prfaddelrsiiahprieriirqahtvgrdaavhraldairygiihgf 

SEQ ID NO: 132 181 

SEQ ID NO: 133 184 elsdli. 

EL I 
SEQ ID NO: 39 1149 eagleheaklf aeawdpnggkrgiref ldrqsaplptrrplitpeqeql

SEQ ID NO: 132 181 kweps el 

SEQ ID NO: 133 190 eeakklak ■ 

£AK A W P L 



SEQ ID NO: 39 
SEQ ID NO: 132 
SEQ ID NO: 133 



1199 lrdqkellpvgspffpgvdripkwqyaqavird^dtgaaahgdpivaekq 

18 9 mntakei — 

198 kmmsksq 

KE Q 



SEQ ID NO: 39 
SEQ ID NO: 132 
SEQ ID NO: 133 



SEQ ID NO: 39 
SEQ ID NO: 132 
SEQ ID NO: 133 



1249 i ivpverpr anqa liyvlas eynf ndiwai tgipvsr f dehdrdwhvtga 

205 i ■ : 1 

I AN PV 

1299 ggiglivalgeearregrlJcvgdlvaiysgqsdllsplmgldpmaadfvi 
207 — ~»— -vklskqain r g m - — — - — — — ■ — — - 

206 aislakeainkg- 

V L EA G 



SEQ ID NO: 39 
SEQ ID NO: 132 
SEQ ID NO: 133 



SEQ ID NO:39 
SEQ ID NO: 132 
SEQ ID NO: 133 



SEQ ID NO: 39 
SEQ ID NO: 132 
SEQ ID NO: 133 



SEQ ID NO:39 
SEQ ID NO: 132 
SEQ ID NO:133 



1349 qgndtpdgshqqfmlaqapqclplptdmsieaagsyilnlgt iyralf tt 
219 : — -— qc-didtalafeaea fgecfat 

218 QC TtD E F T 



1399 
240 
224 



lqikagrtifiegaatgtgldaareaarnglrvigmveaearaatllaag 
edqkdamtafie- 



I FIE 



-tgntieaekfal- 
TG A 



1449 
252 
236 



ahgainrkdpevadcftrvpedpsawaaweaagqpllamfraqndgrlad 

eft 

CFT 

1499 ywahagetafprafqllgeprdghiptltfygatsgyhftflgkpgsas 

252 : 

239 



SEQ ID NO:39 
SEQ ID NO: 132 
SEQ ID NO: 133 



SEQ ID NO: 39 
SEQ ID NO: 132 
SEQ ID NO: 133 



SEQ ID NO: 39 
SEQ ID NO: 132 
SEQ ID NO: 133 



SEQ ID NO: 39 
SEQ ID NO: 132
SEQ ID NO: 133 



1549 ptemlrranlrageavliyygvgsddlvdtggleaieaarcpgariwvt 

252 . 

239 ~ 

1599 vadaqrefvlslgfgaalrgwslaelkrrfgdefewprtnpp^pnarqd 

252 ^ krk 

239 -tddqke gmiafae-kr~ 

D Q E G E KR 

1649 pqglkeavr r f ndlvf kplgaa vgvf lr aadxipr gypdllieraahdala 

255 ** 

254 



IE 



1699 vs; 
257 
254 



lamlikpftgrivyfediggrrysffapqiwvrqrriynptaqifgthl 



-apk- 
AP 



fgk- 
FG 
SEQ ID NO: 39 1749 snayeilrlndeisaglltitepawpwdelpeahqamwenrhtaatyw

SEQ ID NO: 132 257 - 

SEQ ID NO: 133 260 ; 

SEQ ID NO: 39 1799 nhalprlglknrdelyeavtager 

SEQ ID NO: 132 257 gfknr 

SEQ ID NO: 133 260

G KNR 
Figure 42



SEQ ID NO: 39 1 midtaplappraprsnpirdrvdweaqraaaladpgafhgaiartvihwy

SEQ ID NO: 134 1 

SEQ ID NO: 135 1 



SEQ ID NO: 39 51 dpqhhcwirfnessqrwegldaatgapvtvdypadyqpwqqafddseapf 

SEQ ID NO: 134 1 xnaasaap 

SEQ ID NO: 135 1 

AA AP 

SEQ ID NO: 39 101 yrwfsggltnacfnevdrhvwmgygdevayyfegdrwdnslnngjrggpw 

SEQ ID NO: 134 B — 

SEQ ID NO: 135 1 

SEQ ID NO: 39 151 qetitrrrllvevvkaagvlrdlglkkgdrialniapnltopqiyyteaakr 

SEQ ID NO: 134 8 — 

SEQ ID NO: 135 1 ' 



SEQ ID NO: 39 201 lgtlytpvf ggfsdktlsdrihnagarvvitsdgayrnaqvvpykeaytd 

SEQ ID NO: 134 8 awtg

SEQ ID NO:135 1 ; 

AT 

SEQ ID NO: 39 251 qaldJcyipvetaqalvaqtlatlpltesqrqtiiteveaalageitvers 

SEQ ID NO: 134 12 q— r taeak 

SEQ ID NO: 135 1 mtiqtlettalJtd- 



SEQ ID NO:39 
SEQ ID NO: 134 
SEQ ID NO: 135



OTX* T L 



T E 



301 dvmrgvgBalaklrdldasvqakvrtvlaqalve appi v eawwrht gq 

18 d 

14 



SEQ ID NO: 39 351 eilwnegrdrwshdlldaalakilanaraagfdvhsendllnlpddqlir 

SEQ ID NO:134 19 

SEQ ID NO: 135 14 

SEQ ID NO: 39 401 alyasipcepvdaeypmfiiytsgstgkpkgvihvhggyvagwfatlrvs 

SEQ ID NO: 134 19 

SEQ ID NO: 135 14 — i 

SEQ ID NO: 39 451 fdaepgdtiyvia^pgwitgqsymltatmagrltgviaegsplfpsagry 

SEQ ID NO: 134 19. 

SEQ ID NO: 135 14 

SEQ ID NO: 39 501 aaiierygvqif kagvtflktvmsnpqhvedvrlydmhslrvatfcaepv 

SEQ ID NO: 134 19 lyel 

SEQ ID NO: 135 14 lyci 

LY 
SEQ ID NO: 39 551 spavqqfgmqimtpqyinsywatehggivwthfygnqdfplrpdahtypl

SEQ ID NO: 134 23 

SEQ ID NO: 135 18 ■ 



SEQ ID NO: 39 601 pwvmgdvwvaetdesgttryrvadfdekgeivitapypyltrtlwgdvpg

SEQ ID NO: 134 23 

SEQ ID NO: 135 18 



SEQ ID NO: 39 651 feaylrgeiplrawkgdaerfvktywxxgpngewgyiqgdfaikypdgsf 

SEQ ID NO: 134 23 geip 

SEQ ID NO: 135 18 geip : 

GEIP . . 

SEQ ID NO: 39 701 tlhgrpddvlnvsghxmgteeiegallrdrqitpdspvgnciwgaphre 

SEQ ID NO: 134 27 

SEQ ID NO: 135 22 : 



SEQ ID NO: 39 751 kgltpvaf iqpapgrhltgadrrrldelvrtekgavsvpedylevsafpe 

SEQ ID NO: 134 27 

SEQ ID NO: 135 22 pafhv— . p k 

P H p 

SEQ ID NO: 39 801 trsgkymrrf lnunmldeplgdtttlrnpevleelaakiaewkrrqr 

SEO, ID-NO: 134 27 ^F-^f^^«--plg*~^, hvpalanyawairr- 

SEQ ID NO: 135 29 t myawsirk-

T . PW3 AK W R 



SEQ ID NO: 39 851 eqqlieryxyfrleyhpptasagklawtvtnppviialneraldelntiif 
SEQ ID NO: 134 43 • 
SEQ ID NO: 135 36 • 



ER 

SEQ ID NO: 39 901 dhlarrqdvaaivftgqgarsfvagadirqlleeihtveeamalpnnahl

SEQ ID NO: 134 46 

SEQ ID NO: 135 38



SEQ ID NO: 39 951 af rkienankpciaaingvalggglef amachyrvadvyaef gqpeinlr 

SEQ ID NO: 134 46 gppe 

SEQ ID NO: 135 38 exhgkp ! 

ER KP G PE 

SEQ ID NO: 39 1001 llpgyggtqrlprllykrnngtgllralemilggrsvpadealklglida 

SEQ ID NO: 134 50

SEQ ID NO: 135 44 



SEQ ID NO: 39 1051 iatgdqdslslacalaraaigadgqliesaavtqaf rhrheqldewrkpd 

SEQ ID NO: 134 50 

SEQ ID NO: 135 44 tqamq 

TQA 

SEQ ID NO: 39 1101 prfaddelrsiiahprieriirqahtvgrdaavhraldairygilhgfea 

SEQ ID NO: 134 50 qsh-

SEQ ID NO: 135 49 

Q H 
SEQ ID NO: 39 1151 gleheaklfaeavvdpnggkrgirefldrqsaplptrrplitpeqeqllr

SEQ ID NO: 134 53 : r- 

SEQ ID NO: 135 49

SEQ ID NO: 39 1201 dqkellpvgspffpgvdrip)cwqyaqavlrdpdtgaaahgc|pivaekqii 

SEQ ID NO: 134 53 -qlevlpv— — wel gd 

SEQ ID NO: 135 49 vewptweige- 



Q E LPV V P H GD 

SEQ ID NO: 39 1251 vpverpranqallyvlasevnfndlwaitgipvsrfdehdrdwhvtgsgg 

SEQ ID NO: 134 65 devlvyvmaagvnyngvwaglgepispfdvhkgeyhiagsda 

SEQ ID NO: 135 60 devlvlvmaagvnyngvwaalgepispldghkqpfhiagsda

^ , I* YV A VN N HA 6PSFDB B GS 

SEQ ID NO: 39 1301 iglivalgeearregrlkvgdlvaiysgqsdllsp-lmgldpm-aadfv- 

SEQ ID NO: 134 107 sgivwkvgakvk rwkvgdevivhcnqddgddeecnggdpm-f sptgr 

SEQ ID NO: 135 102 sgivwkvgakvk rwklgdewihcnqddgddeecnggdpmf s&sqr- 

G G R KVGD V I QD G DEM 

SEQ ID NO: 39 1348 iqgndtpdgshqqfiBlaqapqclpiptdmsieaageyilnlgtiyralf- 
SEQ ID NO: 134 153 iwgyetgdgsfaqfcrvqBrqlmarpkhltweeaacytltlatayrmlfg 
SEQ ID NO: 135 148 iwgyetpdgsfaqf crvqsrqllprpkhl twees acytltlataymlfg 
I G TPD6S QF Q Q LP P EAYILTYRLT 

SEQ ID NO: 39 1397 -ttlqikagrtif iegaatgtgldaarsaarnglrvignivssssrastll 

SEQ ID NO: 134 203 haphtvrpgqDVliwgasgglgvfgvqlcaasganalavisdeskrdyvm 

SEQ ID NO: 135 198 hkphelkpgqnvlvwgasgglgyf atqlaavaganaigwssedkrefvX 

' U KG I GA G G A AA G IG VSS S L 

SEQ ID NO: 39 1446 aagahgainrkd^evadcftrvpedpsawaaweaagqpllamfraqndgr 

SEQ ID NO: 134 253 slgakgvinrkd fdc — -w ~ ~~ 

SEQ ID NO: 135 248 smgakavlnrge fncwgqlpk 

GA G INRKD DC P 

SEQ ID NO: 39 1496 ladywsbagetafprsfqllgeprdghtptltfygatsgyhftf Igkpg 

SEQ ID NO: 134 269 gqlptv - 

SEQ ID NO: 135 269 T~ZZ "^F^ 

G PT G F 

SEQ ID NO: 39 1546 sasptenarranlrageavliyygvgsddl vdtgglealeaarqm gariv 

SEQ ID NO: 134 275

SEQ ID NO: 135 275 - ■ ~ 

SEQ ID NO: 39 1596 vvtvsdaqrefvlslgfgaalrgvvslAeUa^fgdefewprtinpplpM 

SEQ ID NO: 134 275 "™ "T ** 

SEQ ID NO: 135 275 ndymke- srkfgkai-vqit — 

D E R 96 V I » 



SEQ ID NO:39 1646 rq(%)qglkeavrrfndlvfl5)lgsavgv£lrsadx«>rgypdliieraahd 

SEQ ID NO: 134 277 peyntwlkea-rkfgkaiwditgkgndv- divfehpgea 

SEQ ID NO: 135 293 gnkd* * dmvfehpgeq 

GLKEA R F G V . D E 

SEQ ID NO: 39 1696 alavsamlikpftgrivyfediggrrysffapqiwvrqrriyitptaqifg 

SEQ ID NO: 134 314 tfpvstlvakr-gggdvfcagttgfnitfdaryvwmrqk rlq - g 

SEQ ID NO: 135 308 tfpvsvf Ivkr-ggmwicagttgf nltmdarf lwmrqkrvq— ~g

VSLKGIV GFAWRQRI G 
SEQ ID NO: 39 1746 thlsnayeilrlndeisaglltitepawpwdelpeahqamwenrhtaat
SEQ ID NO: 134 356 shfahlkqasaangfvindrrvdpcmsevfpwdkipaahtkinwknghppgn 
SEQ ID NO: 135 350 shfanlmqasaanqlvidrrvdpclsevfpwdqipaahekmlanqhlpgn 
HN N V PWD P AH MW N H 

SEQ ID NO: 39 1796 ywnhalprlglknrdelyeawtager 
SEQ ID NO: 134 406 mavlvnstraglrtvedvieagplkam 
SEQ ID NO: 135 400 mavlvcaqrpglrtfeevqelsgap—
V R GL E EA A 
Figure 49

ATGGCGACGGGGGAGTCCATGAGCGGAACAGGACGACTGGCAGGAAAGATTGGGTTAATT

ACCGGTGGCGCCGGCAATATCGGCAGTGAATTGACACGTCGCTTTCTCGCAGAGGGAGCG 

ACGGTCATTATTAGTGGACGGAATCGGGCGAAGTTGACCGCACTGGCCGAACGGATGCAG 

GCAGAGGCAGGAGTGCCGGCAAAGCGCATCGATCTCGAAGTCATGGATGGGAGTGATCCG

GTCGCGGTACGTGCCGGTATCGAAGCGATTGTGGCCCGTCACGGCCAGATCGACATTCTG 

GTCAACAATGCAGGAAGTGCCGGTGCCCAGCGTCGTCTGGCCGAGATTCCACTCACTGAA 

GCTGAATTAGGCCCTGGCGCCGAAGAGACGCTTCATGCCAGCATCGCCAATTTACTTGGT 

ATGGGATGGCATCTGATGCGTATTGCGGCACCTCATATGCCGGTAGGAAGTGCGGTCATC 

AATGTCTCGACCATCTTTTCACGGGCTGAGTACTACGGGCGGATTCCGTATGTCACCCCT 

AAAGCTGCTCTTAATGCTCTATCTCAACTTGCTGCGCGTGAGTTAGGTGCACGTGGCATC 

CGCGTTAATACGATCTTTCCCGGCCCGATTGAAAGTGATCGCATCCGTACAGTGTTCCAG 

CGTATGGATCAGCTCAAGGGGCGGCCGGAAGGCGACACAGCGCACCArTTTTTGAACACC 

ATCCGATTGTGTCGTGCCAACGACOVGGGCGCGCTTGAACGTCGGTTCCCCTGCGTCGGT 

GATGTGGCAGAGGGCGCTGTCTTTCTGGCCAGTGCCGAATCCGCCGCTCTCTCCGGTGAG 

ACGATTGAGGTTACGCACGGAATGGAGTTGCCGGCCTGCAGTGAGACCAGCCTGCTGGCC 

CGTACTGATCTGCGCACGATTGATGCCAGTGGCCGCACGACGCTCATCTGCGCCGGCGAC 

CAGATTGAAGAGGTGATGGCGCTCACCGGTATGTTGCGTACCTGTGGGAGTGAAGTGATC 

ATCGGCTTCCGTTCGGCTGCGGCGCTGGCCCAGTTCGAGCAGGCAGTCAATGAGAGTGGG 

CGGCTGGCCGGCGCAGACTTTACGCCTCCCATTC^CTTGCC^CTCGATCCACGCGATCCG 

GCAACAATTGACGCTGTCTTCGATTGGGCCGGCGAGAATACCGGCGGGATTCATCCW3CG 

GTGATTCTGCCTGCTACCAGTCACGAACCGGCACGGTGCGTGATTGAGGTTGATGATGAG 

CGGGTGCTGAATTTTCTGGCCGATGAAATCACCGGGACAATTGTGATTGCCAGTCGCCTG 

GCCCGTTACTGGCAGTCGCAACGGCTTACCCCCGGCGCACGTGCGCGTGGGCCGCGTGTC 

ATTTTTCTCTCGAACGGTGCCGATCAAAATGGGAATGTTTACGGACGCATTCAAAGTGCC 

GCTATCGGTCAGCTCATTCGTGTGTGGCGTCACGAGGCTGAACTTGACTATCAGCGTGCC 

AGCGCCGCCGGTGATCATGTGCTGCCGCCGGTATGGGC5CAATCAGATTGTGCGCTTGGCT 

AACCGCAGCCTTGAAGGGTTAGAATTTGCCTGTGCCTGGACAGCTCAATTGCTCCATAGT 

CAACGCCATATCAATGAGATTACCCTCAACATCCCTGCCAACATTAGCGCCACCACCGGC 

GCACGCAGTGCATCGGTCGGATGGGCGGAAAGCCTGATCGGGTTGCATTTGGGGAAAGTT 

GCCTTGATTACCGGTGGCAGCGCCGGTATTGGTGGGCAGATCGGGCGCCTCCTGGCTTTG 

AGTGGCGCGCGCGTGATGCTGGCAGCCCGTGATCGGCATAAGCTCGAACAGATGCAGGCG 

ATGATCCAATCTGAGCTGGCTGAGGTGGGGTATACCGATGTCGAAGATCGCGTCCACATT 

GCACCGGGCTGCGATGTGAGTAGCGAAGCGCAGCTTGCGGATCTTGTTGAACGTACCCTG 

TCAGCTTTTGGCACCGTCGATTATCTGATCAACAACGCCGGGATCGCCGGTGTCGAAGAG 

ATGGTTATCGATATGCCAGTTGAGGGATGGCGCCATACCCTCTTCGCCAATCTGATCAGC 

AACTACTCGTTGATGCGCAAACTGGCGGGGTTGATGAAAAAACAGGGTAGCGGTTACATC 

CTTAACGTCTCATCATACTTTGGCGGTGAAAAAGATGCGGCCATTCCCTACCCCAACCGT 

GCCGATTACGCCGTCTCGAAGGCTGGTCAGCGGGCAATGGCCGAAGTCTTTGCGCGCTTC 

CTTGGCCCGGAGATACAGATCAATGCCATTGCGCCGGGTCCGGTCGAAGGTGATCGCTTG 

CGCGGTACCGGTGAACGTCCCGGCCTCTTTGCCCGTCGGGCGCGGCTGATTTTGGAGAAC 

AAGCGGCTGAATGAGCTTCACGCTGCTCTTATCGCGGCTGCGCGCACCGATGAGCGATCT 

ATGCACGAACTGGTTGAACTGCTCTTACCCAATGATGTGGCCGCACTAGAGCAGAATCCC 

GCAGCACCTACCGCGTTGCGTGAACTGGCACGACGTTTTCGCAGCGAAGGCGATCCGGCG 

GCATCATCAAGCAGTGCGCTGCTGAAGCGTTCAATTGGCGCTAAATTGCTGGCTCGTTTG 

CATAATGGTGGCTATGTGTTGCCTGCCGAC^TCTTTGCAAACCTGCCAAACCCGCCCGAT 

ccx:ttcttcacccgagcccagattgatcgcgaggctcgcaaggttcgtgacggcatcatg 

GGGATGCTCTACCTGCAAt^ATGCCGACTGAGTTTGATGTCGCAATGGCCACCGTCTAT 
TACCTTGCCGA<XGCAATGTCAGTGGTGAGACATTCCACC 
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gaacgcac<xx:taccggtggcgaactct^ 

ctggtcggaagcacggtctatctgataggtgaacatctgactgaacaccttaacctgctt 
<5gccgtgcgtacctcgaacgttacggggcacgtcaggta<3tgatgattgttgagacagaa 
accggggcagagacaatgcgtcgcttgctccacgatcacgtcgaggctggtc<3gctgatg 
actattgtggccggtgatcagatcgaagccgctatcgaccaggctatcactcgctacggt 
cgcccagggccggtcgtctgtacccccttccggccactgccgacggtaccactggtcggg 
cgtaaagacagtgactggagcacagtgttgagtgaggctgaatttgccgagttgtgcgaa 
cacxagct^cccaccatttccgggtagcgggcaagattgccctgagtgatggtgccagt 
ctcgg^tggtcactgccgaaactacggctacctcaactaccgaggaatttx^tctcgct 
aacttcatcaaaacgacccttcac^ 

-actgctcagcgcattctgatcaatcaagtggatctgagccggcotgcgcgtgccgaagag 

CCGCGTGATCCGCACGAGCGTCAACAAGAACTGGAACGTTTTATCGAGGCAGTCTTGCT<5 
GTCACTGCACCACTCCCGCCTGAAGCCGATACCCGTTACGCCGGGCGGATTCATCGCGGA

CGGGCGATTAGCGTGTAA (SEQ ID NO: 140) 




Figure 50

MATGESMSGTGRLAGKIGLITGGAGNIGSELTRRFLAEGATVIISGRNRAKLTALAERMQ

AEAGVPAKRIDLEVMDGSDPVAVRAGIEAIVARHGQIDILVNNAGSAGAQRRLAEIPLTE

AELGPGAEETLHASIANLLGMGWHLMRIAAPHMPVGSAVINVSTIFSRAEYYGRIPYVTP

KAALNALSQIAARELGARGIRVNTIFPGPIESDRIRTVFQRMDQIjKGRPEGDTAHHFIiNT 

MRLCPJ^DQGALERRFPSVGDVADAAVFLASAESAALSGETIEVTHGMELPACSETSIiLA 

RTDLRTIDASGRTTIICAGDQIEEVMALTGMLRTCGSEVIICFRSAAALAQFEQAVNESR 

RLAGADFTPPIALPLDPRDPATIDAVFDWAGENTGGIHAAVILPATSHEPAPCVIEVDDE 

R^NFLADEITGTIVIASRLARyWQSQRLTPGARARGPRVIFLSNGADQNGNVYGRIQSA 

AIGOLIRVWRHEAELDYQRASAAGDHVLPPWJANQIVRFAHRSLEGLEFACAWTAQLLHS 

QRHINEITLNI PANI SATTGARSASVGWAESLIGLHLGKVALITGGSAGIGGQIGRLLAL 
SGARVMIAARDRHKLEQMQAMIQSELAEVGYTDVEDRVH1APGCDVSSEAQEADLVERTL 

S AFGTVDYLINN AGI AGVEEMVI DMPVEGWRHTLFANLI SNYSLMRKLAPIiMKKQGSGYI 
L1WSSYFGGEKDAAIPYPNPADYAVSKAGQPAMAEVFARFLGPEIQIHAIAPGPVEGDRL 
RGTGERPGLFARRARLILENKRLNELHAALIAAARTDERSMHELVEL^ 
AAPTALRElARRFP^EGDPAASSSSALLNRSIAAKLIJUUJiNGGYVLPADIFANLPHPPD 

PFFTRAQIDREARKVPJXsIMGMLYLQRMPTEFDVAMATVYYIA^ 

F.RT PT GGEL FGL PS PERIAEliVGST VYLI GEHLTEHLNLLARAYLERYG ARQVVMI VETE 
TGJ^TMRRLLHDHVEAGRI»MTIVAGDQIEAAID^ITRYGRP<3F 

RK^SDWSTVLSEAEFAELCEHQLTHHFRVARKISLSDGASIiALVTPETTATSTTEQFALA 
NFIKTTLHAFTATIGVESERTAQRILINQVDLTRRARAEEPRDPHERQQELERFIEAVIi 

VTAPLPPEADTRYAGRIHRGRAITV (SEQ ID NO: 141) 
Figure 51

TCTTTCTGGCCAGTGCCGAATGCGCCGCTCTCTCCGGTGAGACGATTGAGGTTACGCACG
GAATGGAGTTGCCGGCCTGCAGTGAGACCAGCCTGCTGGCCCGTACTGATCTGCGCACGA
TTGATGCCAGTGGCCGCACGACGCTCATCTGCGCCGGCGACCAGATTGAAGAGGTGATGG
CGCTCACCGGTATGTTGCGTACCTGTGGGAGTGAAGTGATCATCGGCTTCCGTTCGGCTG
CGGCGCTCGCCCAGTTCGAGCAGGCAGTCAATGAGAGTCGGCGGCTGGCCGGCGCAGACT
TTACGCCTCCCATTGCCTTGCCACTGGATCCACGCG (SEQ ID NO: 142)
SEQ ID NO: 141
SEQ ID NO: 143
SEQ ID NO: 144
SEQ ID NO: 145
SEQ ID NO: 146
SEQ ID NO: 147



Figure 52 

matgesmsgtgrlagkialitggagnigseltrrflaegatviisgrnra

mfankwlvtggssgigaatveafvkegasvafvgrnqa 

rarlegkvclitgaasgigkattllfaqegatviagdiske 



-^mekf- 



SEQ ID NO: 141
SEQ ID NO: 143
SEQ ID NO: 144 
SEQ ID NO: 145 
SEQ ID NO: 146 
SEQ ID NO: 147 

SEQ ID NO: 141
SEQ ID NO: 143 
SEQ ID NO: 144 
SEQ ID NO: 145 
SEQ ID NO: 146
SEQ ID NO: 147 

SEQ ID NO: 141 
SEQ ID NO: 143
SEQ ID NO: 144
SEQ ID NO: 145 
SEQ ID NO: 146 
SEQ ID NO: 147 

SEQ ID NO: 141 
SEQ ID NO: 143 
SEQ ID NO: 144 
SEQ ID NO: 145 
SEQ ID NO: 146 
SEQ ID NO: 147 



1 i 
1 
1 
1 

\ mrllhkrtlvtggsdgiglaiaeaflsegadvlivgrdaa 

51 kltalaermqa--e-agvpakridlevmdgsdpvavragieaivarhgqi
"4^]ateve S ^q--hgan^ 

41 nldslvk—ea~e~glp — ~ ' Z g 

i ■ — ~_. 

41 kleaarqklaalgq-aga vetssadlatslgvatweqvketgrpl 

98 dilvnnagsagaqrrlaeiplteaelgpgaeetlhasianllgmgwhlmr
83 dvlvnnagil rfaav— leptllqtfdetn ntnlrp w— -llts 

57 d ZZZ 

1 ~" ~~ 

86 dtpILiag^adl vpf esv scaqf qhsf alnvaaaff Xtq 

148 iaaphm^vgsavinvstifar-aeyygri^^ 
123 iaiphliatkg8ivnv8Silstivripglms--y6V8kaaii«Sh£tklaea 



-p — yv- 



125 gllphf^gaslinissyfar-^ 
194 elgargirvntifpgpiesdrirtvfqrmdqlkgrpegdtahhflntmrl

171 elapsgvrvnsimpgpv- — _ 

€4 tdr- 
173
SEQ ID NO: 141
SEQ ID NO: 143 
SEQ ID NO: 144 
SEQ ID NO: 145 
SEQ ID NO: 146 
SEQ ID NO: 147

SEQ ID NO: 141 
SEQ ID NO: 143 
SEQ ID NO: 144 
SEQ ID NO: 145 
SEQ ID NO: 146 
SEQ ID NO: 147 

SEQ ID NO: 141 
SEQ ID NO: 143 
SEQ ID NO: 144 
SEQ ID NO: 145 
SEQ ID NO: 146 
SEQ ID NO: 147 
244 crandqgalerrfpsvgdvadaavflasaesaalsgetievthgmelpac

1B8 itdla ~T~~ ~_ 

67 ' ' I I I 

16 ' " 

9 *P r ~ 

^-192 -— -amrr--TT~~^~~~~"~" ~ . 

294 setsllartdlrtidasgrttlicagdqieevmaltgmlrtcgseviigf

193 ■ " ~~ ~ ~ ' ~~ 
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196 «:vd- 

344 rsaaalaqfeqavnesrrlagadftppialpldprdpatidavfdwagen

193— ' 
SEQ ID NO: 141 394 tggihaavilpatshepapcvievddervlnfladeitgtiviasrlary

SEQ ID NO: 143 205 tg ahtp ; 

SEQ ID NO: 144 73 ' — ■ : 

SEQ ID NO: 145 29 lp —

SEQ ID NO: 146 24 qplp

SEQ ID NO: 147 200 ; 

SEQ ID NO: 141 444 wqsqrltpgarargprviflsngadqngnvygriqsaaigqlirvwrhea

SEQ ID NO: 143 211 ! 

SEQ ID NO: 144 73 



SEQ ID NO: 145 31 lsededyrge— gklk- 

SEQ ID NO: 146 26 : \ " 

SEQ ID NO : 147 200 

SEQ ID NO: 141 494 eldyqrasaagdhvlppvwanqivrfanrsleglefacawtaqllhsqrh

SEQ ID NO: 143 211 — ■ 

SEQ ID NO: 144 73 - 

SEQ ID NO: 145 45 — — — -— 

SEQ ID NO: 146 31 ensyqgsgrlkd r — 

SEQ ID NO: 147 200 : — ~— _ 

SEQ ID NO: 141 544 ineitlnipanisattgarsasvgwaesliglhlgkvalitggsagiggq

SEQ ID NO: 143 211 lgkaa—

SEQ ID NO:144 73 , . " " 

SEQ ID NO: 145 45 gkvailtggdsgigra 

SEQ ID NO: 146 43 krali tggds glgra 

SEQ ID NO: 147 200 nlpa 1 — 

SEQ ID NO: 141 594 igrllalsgarvmlaardrhk-leqmqamiqselaevgytdvedrvhiap

SEQ ID NO: 143 216 . " 

SEQ ID NO: 144 73 : — : "77" 

SEQ ID NO: 145 61 aaiafakegadisilyldehsdaeetrkrieke nvrcllip 

SEQ ID NO:146 58 vaiayaregadvlisylaehd damatkalve eagrkavlaa 

SEQ ID NO: 147 204 ' 

SEQ ID NO: 141 643 gcdvsseaqladlvertlsafgtvdylinnagiagveemvidmpvegwrh

SEQ ID NO: 143 219 eiadml 1 " *" "~~ 

SEQ ID NO: 144 73 vekwqkygridvlvxmagitr-dallvnaKeedwda 

SEQ ID NO: 145 102 g-dvgdenhceqavqqtvdhfgkldilvimaaeqhpqd8ilnieteqlek 
SEQ ID NO: 146 99 g-diqssdhcrrivetavrelggidilvnnaahqatfkniedisdeewel 

SEQ ID NO: 147 204 eakaelkayvers — 

SEQ ID NO: 141 693 tlfanlisnyslmrklaplmkkqgsgyilnvssyfggekdaaipypnrad

SEQ ID NO: 143 225 " 7" "77 

SEQ ID NO: 144 109 vinvnlkgvfnvtqmwpymikqrngsivnvaewg iygnpgqtn 

SEQ ID NO: 145 151 tfrtnif smfhmtkkalphl— qegcaitnttsitayegdtal Id 

SEQ ID NO: 146 148 tfrvzanhamfyltkaavphmkk-gsa-iintasi nadvpnpilla 

SEQ ID NO: 147 217 

SEQ ID NO: 141 743 yavskagqramaevfarfl-gpe-iqinaiapgpvegdrlrgtgerpglf

SEQ ID NO: 143 225 ^ ' "• " 

SEQ ID NO: 144 154 yaaskagvigmtktwakelagra-irvnavapgtie —

SEQ ID NO: 145 194 yastkgaivsftramakal-adkgirvnavapgpi

SEQ ID NO: 146 191 yattkgaihnf aagla<pul-aergirvnwapgpi- 

SEQ ID NO: 147 217 yplgrigr 
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~wtp 

-wtplipstmpedtva-df gk 
* — ---pddlagm— : — 
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244 
891 ifanlpnppdpfftraqidrearkvrdgimgmlylqrmptefdvamatvy
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266 pmssy— — ■ -" - ---- — ; - 
232 



SEQ ID NO: 141 791 arrarlilenkrlnelhaaliaaartdersmhelvelllpndvaaleqnp

SEQ ID NO: 143 225 * ■ 

SEQ ID NO:144 189 

SEQ ID NO: 145 . 228 

SEQ ID NO: 146 225 

SEQ ID NO: 147 225 

SEQ ID NO: 141 
SEQ ID NO: 143 
SEQ ID NO: 144 
SEQ ID NO: 145 
SEQ ID NO: 146 
SEQ ID NO: 147 

SEQ ID NO: 141 
SEQ ID NO: 143 
SEQ ID NO: 144 
SEQ ID NO: 145 
SEQ ID NO: 146 
SEQ ID NO: 147 

SEQ ID NO: 141 941 yladrnvsgetfhpsgglryertptggelfglpsperlaelvgstvylig

SEQ ID NO: 143 227 .lasdk : aksvtgscyi — 

SEQ ID NO: 144 189 tpmteklpekareta 

SEQ ID NO: 145 244 : - : — hgldtp 

SEQ ID NO: 146 271 vsgatiavtgg 

• ID NO: H? — -234 y l a ed eaawtoggi — 

991 ehltehlnllaraylerygarqvvmivetetgaetmrrllhdhveagrlm

242 

204 — lsriplgrfgkpe — — evaqvi 

250 — : '■ 

282 

248 ~~ 

1041 tivagdqieaaidqaitrygrpgpvvctpfrplptvplvgrkdsdwstvl

242 « 

223 lflaadessyvtgqvi gidgglvi 

250 : — mgrpgqpv — 

282 kp*l- 

248 favdggyt- 



SEQ ID NO: 141 
SEQ ID NO: 143
SEQ ID NO: 144 
SEQ ID NO: 145 
SEQ ID NO: 146 
SEQ ID NO: 147 

SEQ ID NO: 141 
SEQ ID NO: 143 
SEQ ID NO: 144 
SEQ ID NO: 145 
SEQ ID NO: 146
SEQ ID NO: 147 

SEQ ID NO: 141 
SEQ ID NO: 143
SEQ ID NO: 144
SEQ ID NO: 145
SEQ ID NO: 146
SEQ ID NO: 147

SEQ ID NO: 141
SEQ ID NO: 143 
SEQ ID NO:144 
SEQ ID NO:145 
SEQ ID NO:146 
SEQ ID NO: 147 



1091 seaefaelcehqlthhfrvarkialsdgaslalvtpettatstteqfala

242 mdnglalq— — — 

247 " 

258 eha gayvlXaadea 

286 k 

256 " ~ 

1141 nfikttlhaftatigvesertaqrilinqvdltrraraeeprdpherqqe

250 r 

247 

272 symtgqtlhvn — j -~ 

286 — r 

256 



SEQ ID NO: 141 1191 lerfieavllvtaplppeadtryagrihrgraitv

SEQ ID NO:143 250 

SEQ ID NO: 144 247

SEQ ID NO: 145 283 ggrfist 

SEQ ID NO:146 286 

SEQ ID NO: 147 256 a 9 ■ 
SEQ ID NO: 140


1 i 


SEQ ID NO: 148 


1 ■ 


SEQ ID NO:149 


1 • 


SEQ ID NO: 150


1 ■ 


SEQ ID NO: 151 


1 • 


SEQ ID NO: 152 


1 • 


SEQ ID NO: 140




SEQ ID NO: 148




SEQ ID NO: 149 


lb 


SEQ ID NO: 150 




SEQ ID NO: 151


o 
o 


SEQ ID NO: 152 


1 


SEQ ID NO: 140 


100 


SEQ ID NO:148 


42 


SEQ ID NO: 149 


52 


SEQ ID NO: 150 


65 


SEQ ID NO: 151 


22 


SEQ ID NO: 152 


1 


SEQ ID NO: 140 


148 


SEQ ID NO: 148 


54 


SEQ ID NO: 149 


€3 


SEQ ID NO: 150


88 


SEQ ID NO: 151 


29 


SEQ ID NO: 152 


1 



FigureS 

1 atggcgacgggggagtccatgagcggaacaggacgactggcaggaaagat 

atga— -gacttctgcacaagcg 

atg—— -*--—-- — ---ttcgcaaataaagt 

— atgaggcttgaagggaaag — 

— — atggaaa 



tgtgtctgatcacagg- ggctgcaagcgggatagggaaa-gccacca 

— — aatttccgca— ~— — —--ccct 



-ggacggtatcgg 



rcagctactgt- 

cgcttcttttcgcacaggaag— 

ccctt— tc — — — 



gcgaagttgaccgcactggccgaacggatgcaggcagaggcaggagtgcc 

cc tggcaatcgcegaggcgttcctgagcgagg 

»~~ — ggaagcattc- - - - ■ ■ ■ 



gtgaacccaatgg acaga — caaacagaaggacaag 

SEQ ID NO: 140 198 ggcaaagcgcatcgatctcgaagtcatggatgggagtgatccggtcgcgg 

SEQ ID NO: 148 86 gcgc cgatgtcct 

SEQ ID NO: 149 73 gttaaggaagg — 

SEQ ID NO: 150 109 atctcga 

SEQ ID NO: 151 29 ' " 

SEQ ID NO: 152 35 — — aaccgcagc atcagg 

SEQ ID NO: 140 248 tac^gccggtatcgaagcgattgtggcccgtcac^gccagatcgacatt 

SEQ ID NO: 148 99 gatcgtcggccgtgacgcc 

SEQ ID NO: 149 84 cgcttctgtagccttcgtg 

SEQ ID NO: 150 116 ™ aagaaaatctc gactct 

SEQ ID NO: 151 29 cccgcca ~ 

SEQ ID NO:152 50 ' acagacagccgggcatt 

SEQ ID NO: 140 298 ctggtcaacaatgcaggaagtgccggtgcccagcgtcgtctggccgagat 

SEQ ID NO:148 118 ' ^ cc " 

SEQ ID NO: 149 103 ggaagaaaccaagccaag ; 

SEQ ID NO: 150 133 cttgtgaaagaggcagaagg ~ 

SEQ ID NO: 151 36 aacccaggaaatgcc 



SEQ ID NO: 152 67 g-agtcaaaaatgaa tccgctgcc —

SEQ ID NO: 140 348 tccactcactgaagctgaattaggccctggcgccgaagagacgcttcatg 

SEQ ID NO:148 121 aagct €q ^SSSZZ^Z 

SEQ ID NO: 149 121 cttaag— gaagtag —agagcpgc -zg 

SEQ ID NO: 150 153 

SEQ ID NO: 151 51 > - 

SEQ ID NO: 152 90 
SEQ ID NO: 140
SEQ ID NO: 148 
SEQ ID NO: 149 
SEQ ID NO: 150 
SEQ ID NO: 151 
SEQ ID NO: 152 

SEQ ID NO: 140 
SEQ ID NO: 148 
SEQ ID NO: 149 
SEQ ID NO: 150 
SEQ ID NO: 151 
SEQ ID NO: 152 

SEQ ID NO: 140 
SEQ ID NO: 148 
SEQ ID NO: 149 
SEQ ID NO: 150 
SEQ ID NO: 151 
SEQ ID NO: 152 

SEQ ID NO: 140 
SEQ ID NO: 148 
SEQ ID NO: 149
SEQ ID NO: 150
SEQ ID NO: 151
SEQ ID NO: 152

SEQ ID NO: 140
SEQ ID NO:148 
SEQ ID NO:149 
SEQ ID NO: 150 
SEQ ID NO: 151 
SEQ ID NO: 152 

SEQ ID NO: 140 
SEQ ID NO: 148 
SEQ ID NO: 149
SEQ ID NO: 150
SEQ ID NO: 151
SEQ ID NO: 152

SEQ ID NO: 140
SEQ ID NO: 148
SEQ ID NO: 149
SEQ ID NO: 150
SEQ ID NO: 151
SEQ ID NO: 152

SEQ ID NO: 140
SEQ ID NO: 148
SEQ ID NO: 149
SEQ ID NO: 150
SEQ ID NO: 151 
SEQ ID NO: 152 



398 ccagcatcgccaatttacttggtatgggatggcatctgatgcgtattgcg

138 ccagaagc tggcg 

144 ccagcagc — 

153 actt 

51 eg 

90 — : gctgtcagaggacgaggattatc 

448 gcacctcatatgccggtaggaagtgcggtca tcaatgtctcgaccatctt 

151 gc tettggeca 

152 - — — atggagccaacatc — 

157 cegg— ggaag — 



53 gcac- 
113 g 



-aggaa- 



498 ttcacgggctgagtactacgggcggattccgtatgtcacccctaaagctg
ggc — 



162 



cagatgtctcc aaag 



a&ccg 

-aaaactg 



1^6 ctggctatcaaag 

166 

57 . tac — egateggatge- — 

119 — gegg- — 

548 ctcttaatgctctatctcaacttgctgcgcgtgagttaggtgcacgtggc 

— — eggege — : ggtggagacgtc 

^494- ■■■ -■ ■ ■ — -«»^cgagga 

166 " 

76 c _ tgeccgat -caegggg- 



130 aaaggaa 



-aagttg- 



598 atccgcgttaatacgatctttcccggcccgattgaaagtgatcgcat<»g 
cgatcttgcc 



183 gtccg 
201 age — 
166 gttgatccctacgtt 

92' 

143 ■ 



— gaaaatcatcgta— 
ttgaacgtgaccg- — 



-tcct 



-cgatcattactgg- 



648 tacagtgttccagcgtatggatcagctcaaggggcggcccgaaggcgaca 
199 " 




194 -acag 
101 accagggtt© 
156 — 



-aggegaca 



698 cagcgcaccattttttgaacaccatgcgattgtgtcgtgccaacgaccag 

199 -™™ 

217 



215 

124 

164 — 



— ttgtggaaaa-™ - 



— agtcgttcaaa 




-gacaag 



748 ggcgcgcttgaacgtcggttcccctccgtcggtgatgtggcagacgccgc
' — — ac 




gtcgaatc- 



-gatgt- 



catcatcaccggcgggga- 



ageggcate 
SEQ ID NO: 140
SEQ ID NO: 148
SEQ ID NO: 149
SEQ ID NO: 150
SEQ ID NO: 151
SEQ ID NO: 152

SEQ ID NO: 140
SEQ ID NO: 148
SEQ ID NO: 149
SEQ ID NO: 150
SEQ ID NO: 151
SEQ ID NO: 152

SEQ ID NO: 140
SEQ ID NO: 148
SEQ ID NO: 149
SEQ ID NO: 150
SEQ ID NO: 151 
SEQ ID NO: 152 

SEQ ID NO: 140 
SEQ ID NO:148 
SEQ ID NO: 149 
SEQ ID NO: 150 
SEQ ID NO: 151 
SEQ ID NO: 152 

SEQ ID NO: 140 
SEQ ID NO: 148 
SEQ ID NO: 149 
SEQ ID NO: 150 
SEQ ID NO: 151 
SEQ ID NO: 152 

SEQ ID NO: 140 
SEQ ID NO: 148 
SEQ ID NO: 149 
SEQ ID NO: 150
SEQ ID NO: 151
SEQ ID NO: 152

SEQ ID NO: 140
SEQ ID NO: 148
SEQ ID NO: 149
SEQ ID NO: 150
SEQ ID NO: 151
SEQ ID NO: 152

SEQ ID NO: 140
SEQ ID NO: 148
SEQ ID NO: 149
SEQ ID NO: 150
SEQ ID NO: 151 
SEQ ID NO: 152 



798 tgtctttctggccagtgccgaatccgccgctctctccggtgagacgattg

207 cggtgtcgcaaccgtcg-tcgagcaggtgaaa 

225 tgtc gacaagttc gggaagcttg 

255 tctggtga— 



163 gg- 

164 — 



— cagggccgtggcga- 



— tcgcc— 



848 aggttacgcacggaatggagttgccggcctgcagtgagaccagcctgctg 

248 atgt ; 

263 

184 tatgcgcgcgagggag — —~ : c 

164 gcggaat -agggagagc * 

898 gcccgtactgatctgcgcacgattgatgccagtggccgcacgacgctcat

248 — — ggccgctcgacattcct 

252 — « — gcttgtt aacaacgc— — 

2 $3 : — — --acaacgc— 

201 ggacgtccttatcagc tat 

180 ; 

948 ctgcgccggcgaccagattgaagaggtgatggcgctcaccggtatgttgc 

265 .at caacaatg — — — ccggt 

267 " . 

270 



220 ctgag — 
180 



cgagcatgacgacgcgatggccaccaaggct- 



998 gtacctgtgggagtgaagtgatcatcggcttccgttcggctgcggcgctg 

280 ■ — — —————— gtcgccgacctc 

267 tgggatt ctacggttcg 



— gggaat- 



270 

256 ctggtggag-gaag- 

180 — 



1048 gcccagttcgagcaggcagtcaatgagagtcggcggctggccggcgcaga 

292 gtgccgttcga — gagcgtcagcg aggcgca — 

284 cgagtgt tctggagccga 

269 caggtcgc-aaggccgt gcttgccgccggcga 

180 agcag : ~ — 

1098 ctttacgcctcccattgccttgccactcgatccacgcgatccggcaacaa

32i gttccagcactcc 

- — ataca — aactt 

— — — — — — — — aacaa 



302 cttta 

276 

300 c— 
185 



-atccagtcg-tccg acca 



— ctattgcctt- 



1148 ttgacgctg--tcttcgattgggccggcgagaataccggcgggattcatg
334 ttcgcgctc aatgtggcgg cggcg 

317 ttga : ~ 

281 gggatgc ■ gcttcttg 

318 ttgccgcaggatcgtcgaaacggccgttcgggaactcggcggcat 

195 
SEQ ID NO: 140
SEQ ID NO: 148 
SEQ ID NO: 149 
SEQ ID NO: 150 
SEQ ID NO: 151 
SEQ ID NO: 152 

SEQ ID NO: 140 
SEQ ID NO: 148 
SEQ ID NO: 149 
SEQ ID NO: 150 
SEQ ID NO: 151 
SEQ ID NO: 152 

SEQ ID NO: 140 
SEQ ID NO: 148 
SEQ ID NO: 149 
SEQ ID NO: 150 
SEQ ID NO: 151 
SEQ ID NO: 152 

SEQ ID NO: 140 
SEQ ID NO: 148 
SEQ ID NO: 149 
SEQ ID NO: 150 
SEQ ID NO:151 
SEQ ID NO: 152 

SEQ ID NO:140 
SEQ ID NO: 148 
SEQ ID NO: 149 
SEQ ID NO: 150 
SEQ ID NO: 151 
SEQ ID NO: 152 

SEQ ID NO: 140 
SEQ ID NO: 148 
SEQ ID NO: 149 
SEQ ID NO: 150 
SEQ ID NO: 151 
SEQ ID NO: 152 



1196 cagcggtgattctgcctgctaccagtcacgaaccggcaccgtgcgtgatt 

359 ttcttcct cacc « 

321 tgaaact — i ~ 

296 _ 

353 — — ——--.-«-— 

195 tgcta ?— 

1246 gaggttgatgatgagcgggtgctgaattttctggccgatgaaatcaccgg

370 caggggctgctgccgcattt — 

328 atgaac . — acgaatttac— g 

296 tgag — gatgaaa-- 

200 - aagagggggctga— ' ■ 

1296 gacaattgtgattgccagtcgcctggcccgttactggcagtcgcaacggc 
390 ' 



345 tccagttgtcctcatcactagcctg- 

307 r~ 

364 gaca— -™ ~ 

213 



1346 ttacccccggcgcacgtgcgcgtgggccgcgtgtcatttttctctcgaac 

« 390 cggcgc r c 

370 ~ 

307 " 

3€8 ttctcgtcaac 

2i3 : tatctccattctat ac 

1396 ggtgccgatcaaaatgggaatgtttacggacgcattcaaagtgccgctat 

397 ggtgc r~ ; ""1^ 

370 : ■ gctafc 

307 gaagaagactgggatg \ 

379 aatgc 

229 ttagacgagca- 



— ttcggacgca- 



SEQ ID NO: 140 
SEQ ID NO: 148 
SEQ ID NO: 149 
SEQ ID NO: 150 
SEQ ID NO: 151 
SEQ ID NO: 152 

SEQ ID NO: 140 
SEQ ID NO:148 
SEQ ID NO:149 
SEQ ID NO: 150 
SEQ ID NO: 151 
SEQ ID NO: 152 



1446 cggtcagctcattcgtgtgtggcgtcacgaggctgaacttgactatcagc 

404 cgatca — — 

375 cccteatttgatt gctacaaaagggag * 

323 cggt aataaac 

384 

250 , . gagg aaac 

1496 gtgccagcgccgccggtgatcatgtgctgccgccggtatgggccaatcag 

410 " " 

402 " 



334 gtg 

384 

256 acgcaaacg- 



— gatc- 



— — -- aatc~ 
- — agcccatcag 
gaaaaggag 



154 6 attgtgc^cttcgctaacx^cagccttgaagggtta^tttgcctgtgc 

410 " ~ 3 

402 

341 tgaagggt ~~~~~~~Z 

394 gcgaccttcaag- — 4 



280 aatgtccgctgc- 



-i-ctgcttatcc 
SEQ ID NO: 140
SEQ ID NO: 148
SEQ ID NO: 149
SEQ ID NO: 150 
SEQ ID NO: 151 
SEQ ID NO: 152 

SEQ ID NO: 140 
SEQ ID NO: 148 
SEQ ID NO: 149 
SEQ ID NO: 150 
SEQ ID NO: 151 
SEQ ID NO: 152 

SEQ ID NO: 140 
SEQ ID NO: 148 
SEQ ID NO: 149 
SEQ ID NO: 150 
SEQ ID NO: 151 
SEQ ID NO: 152 

SEQ ID NO: 140
SEQ ID NO: 148
SEQ ID NO: 149
SEQ ID NO: 150
SEQ ID NO: 151
SEQ ID NO: 152

SEQ ID NO: 140 
SEQ ID NO: 148 
SEQ ID NO: 149 
SEQ ID NO: 150 
SEQ ID NO: 151 
SEQ ID NO: 152 

SEQ ID NO: 140 
SEQ ID NO: 148 
SEQ ID NO: 149 
SEQ ID NO: 150 
SEQ ID NO: 151 
SEQ ID NO: 152 

SEQ ID NO: 140 
SEQ ID NO: 148 
SEQ ID NO:149 
SEQ ID NO: 150 
SEQ ID NO:151 
SEQ ID NO:152 

SEQ ID NO:140 
SEQ ID NO: 148
SEQ ID NO: 149 
SEQ ID NO: 150 
SEQ ID NO: 151 
SEQ ID NO: 152 



— catagttaacg tatccagtata- 



-gttttcaacg 




1596 ctggacagctcaattgctccatagtcaacgccatatcaatgagattaccc 
410 — 

402 

349 

406 " * " * 

302 cggga — ' " " — ■— ~ 

1646 tcaacatccctgccaacattagcgccaccaccggcgcacgcagtgcatcg

410 tcaacatctcttcctattt — — — cgcccgca- 

424 — — ctgtctacaatag- 

359 

406 -^aacatc-^"=gaagacatcagcga 
307 



gagga — 

-ttgggga- 



1696 gtcggatgggcggaaagcctgatcgggttgcatttggggaaagttgcctt 

437 

437 ■ ' 

359 

427 

307 gatg- 

1746 gattaccggtggcagcgccggtattggtgggcagatcgggcgcctcctgg

437 " 

437 : 

359 

432. -9*TO9 

318 : — 

1796 ctttgagtggcgcgcgcgtgatgctggcagcccgtgatcggcataagctc 

437 

437 ' ~ta& 

359 
437 
316 



-agctgacattccg™ - ""— 



1846 gaacagatgcaggcgatgatccaatctgagctggctgaggtggggtatac

437 agatgatcc — ~ — — 

440 ! gaatac 

359 ■ . — - -tgactcagatgg— — - 

'451 gtcaacatgcacgccatgttc tac 

2is cga-gaaccattgtgaacaagctg ■■■ ■ - 

1896 cgatgtcgaagatcgcgtccacattgcaccgggctgcgatgtgagtagcg

44 6 *™~~™°? 

446 C " 

371 r 

475 c— tgaocaag 

341 



— gcagcgg- 
-tgca 



1946 aagcgcagcttgcggatcttgttgaacgtaccctgtcagcttttggcacc 
— -— -— — — — — — — gccatc— — ~ cage 




345 gcaaacagtggac 



tgccgcacatgaagaa— 



-- gggcagc 
-attttggtaaa 
SEQ ID NO: 140
SEQ ID NO: 148
SEQ ID NO: 149
SEQ ID NO: 150
SEQ ID NO: 151
SEQ ID NO: 152

SEQ ID NO: 140
SEQ ID NO: 148
SEQ ID NO: 149
SEQ ID NO: 150
SEQ ID NO: 151 
SEQ ID NO: 152 



1996 gtcgattatctga-tcaacaacgccgggatcgccggtgtcgaagagatgg

463 gtctactccctgt-ccaagggcgc 

447 7 *" 

371 

514 g -cga-tcatcaacaccg — 

370 ctcgat-atcttagtgaacaacgccg — 



2045 
486 
447 
371 
530 
395 



ttatcgatatgccagttgagggatggcgccataccctcttcgccaatctg 

gttga 

agggattatgtcatacagt 



-cttcca — - 



-tcaatgccgacgttcccaatccg 
— ctg 



SEQ ID NO: 140 
SEQ ID NO: 148 
SEQ ID NO: 149 
SEQ ID NO: 150 
SEQ ID NO: 151 
SEQ ID NO: 152

SEQ ID NO: 140 
SEQ ID NO: 148 
SEQ ID NO: 149 
SEQ ID NO: 150 
SEQ ID NO: 151 
SEQ ID NO: 152 

SEQ ID NO: 140 
SEQ ID NO: 148 
SEQ ID NO: 149 
SEQ ID NO: 150 
SEQ ID NO: 151
SEQ ID NO: 152 

SEQ ID NO: 140 
SEQ ID NO: 148 
SEQ ID NO: 149 
SEQ ID NO: 150 
SEQ ID NO: 151 
SEQ ID NO: 152 

SEQ ID NO: 140 
SEQ ID NO: 148
SEQ ID NO: 149 
SEQ ID NO: 150 
SEQ ID NO: 151 
SEQ ID NO: 152 

SEQ ID NO: 140 
SEQ ID NO: 148 
SEQ ID NO: 149 
SEQ ID NO:150 
SEQ ID NO: 151 
SEQ ID NO: 152 



2095 atcagcaactactcgttgatgcgcaaactggcgccgttgatgaaaaaaca 

491 actcgttga ■ 

466 

371 tggtgccctacatgatcaaaca 

559 ate — ■ — ctactcgcctatgcg accacca 

398 aacagcatc : ccca 



2145 gggtagcggttacatccttaacgtctcatcatactttggcggtgaaaaag 
500 — 

466 

393 gaggaacggttcgatcgtgaacgtctcctctgtcgtt 

584 agggegeg™ — ate cacaattt- 

411 ggacag cattctcaataCttcaaca- 

2195 atgcggccattccctaccccaaccgtgccgattacgccgtctcgaaggct 
500 — — — — — — ————————— -ccagatcgct 

46 6 gtgtcaaaggct 



435 ataegggaat- 
603 

436 



cctggtcagacgaattacgcggcgtcgaaggcg 

cagcgccg gtctcg > — 



2245 ggtcagcgggcaatggccgaagtctttgcgcgcttccttggoccg ga 

510 ggecttcgag- ctcggcccgcgcgg 

478 g— : : 

478 ggagtcataggaatgacc-aagacgt- 

617 cgcagatgctggccgaa- 

436 gaacagctggaa aaaacctttcgc 



— cgcg g- 



2292 gatacagatcaatgccattgcgccgggtccggtcgaaggtgatcgcttgc 
534 catccgcgtcaacgccatcgcgcccggcacggtcga — 

503 — - ■ — — — — — gggcgaaggaactcgct— — 

639 gataagagtgaatgtcgtggccccgggcccgatc— 
460 -acaaatattttttccat 



2342 gcggtaccggtgaacgtcccggcctctttgcccgtcgggcgcggctgatt 

570 : 

479 ~ 

520 -r 

6 73 tggacgccgctg—r 

477 
SEQ ID NO: 140
SEQ ID NO: 148
SEQ ID NO: 149
SEQ ID NO: 150
SEQ ID NO: 151
SEQ ID NO: 152

SEQ ID NO: 140
SEQ ID NO: 148
SEQ ID NO: 149
SEQ ID NO: 150
SEQ ID NO: 151
SEQ ID NO: 152

SEQ ID NO: 140
SEQ ID NO: 148
SEQ ID NO: 149
SEQ ID NO: 150
SEQ ID NO: 151 
SEQ ID NO: 152 



2392 ttggagaacaagcggctgaatgagcttcacgctgctcttatcgcggctgc 
479 




— ggaagaaacatcagggtgaac 

atcccctccaccatgc — 

gtttca 



gctgt 



2442 gcgcaccgatgagcgatctatgcacgaactggttgaactgctcttaccca

s70 : ^ 

— ctatg- — gatcacttcacaaaat— — — — — 



— cgga 



-tatg-acgaa- 



479 — 

546 g-gcacc- 

701 ccgagga- 

483 " 

2492 atgatgtggccgcactagagcagaatcccgcagcacctaccgcgttgcgt 

57q — — cacc — 

500 — : tggcagcgttggagctg gctccttqtggcgtgcga 

556 ttcat— 
708 
492 



- agaaac cccca t ga c- 



taccg-— — 
■gaaagctttgcct 



SEQ ID NO: 140 2542 gaactggcacgacgttttcgcagcgaaggcgatccggcggcatcatcaag

SEQ ID NO: 148 574 - ^gccatgcggcg— -- -caag 

SEQ ID 110:149 535 gr • ™ 

SEQ ID NO: 150 576 cgaaaaacttccag- 

SEQ ID NO: 151 713 tcgccgatttcg- 

10 NO: 152 SOScacctg — • 

2592 cagtgcgctgctgaaccgttcaattgccgctaaattgctggctcgtttgc 
539 — — accgt- — 



SEQ ID NO: 140 
SEQ ID NO: 148 
SEQ ID NO: 149 
SEQ ID NO: 150
SEQ ID NO: 151 
SEQ ID NO: 152 



536 

5£6 c— 
725 



-tgaac tcagt- 



Tccgtgaaacggcc- 



-gc 



515 aggggtg- 



tgccatta- 



SEQ ID NO: 140 2642 ataatggtggctatgtgttgcctgccgacatctttgcaaacctgccaaac
ceo m Mnoaft S9 4 cgac aacctgcc- 



SEQ ID NO: 148 
SEQ ID NO: 149 
SEQ ID NO: 150
SEQ ID NO: 151
SEQ ID NO: 152

SEQ ID NO: 140
SEQ ID NO: 148
SEQ ID NO: 149
SEQ ID NO: 150
SEQ ID NO: 151
SEQ ID NO: 152

SEQ ID NO: 140
SEQ ID NO: 148
SEQ ID NO: 149
SEQ ID NO: 150
SEQ ID NO: 151
SEQ ID NO: 152



594 

"TUT=± 

610 

727 aaacaggtgcctatg- 
530 ttaat— 



tctttccaga- 



— acgacat- 



2692 ccgcccgatcccttcttcacccgagcccagattgatcgcgaggctcgcaa 

,606-r-^»-- — — ~~ - - . . 

—- gaccagttct— ■ ■■ -->■-• — — — — ~ — ■ ------ 

atacc gctgggaa 



554 
619 



742 
542 



-aa 



— — cgattaccgctt- 



2742 ggttcgtgacggcatcatggggatgctctacctgcaacggatgccgactg 
606 !: ggccga 

564 ; • tac 

632 ggtttgggaagccagaagagg " 

744 g 7 

554 atgaaggggat— 
SEQ ID NO: 140
SEQ ID NO: 148
SEQ ID NO: 149
SEQ ID NO: 150
SEQ ID NO: 151
SEQ ID NO: 152

SEQ ID NO: 140
SEQ ID NO: 148
SEQ ID NO: 149
SEQ ID NO: 150
SEQ ID NO: 151
SEQ ID NO: 152

SEQ ID NO: 140
SEQ ID NO: 148
SEQ ID NO: 149
SEQ ID NO: 150
SEQ ID NO: 151
SEQ ID NO: 152

SEQ ID NO: 140 
SEQ ID NO: 148 
SEQ ID NO: 149 
SEQ ID NO: 150 
SEQ ID NO: 151 
SEQ ID NO: 152 

SEQ ID NO: 140 
SEQ ID NO: 148 
SEQ ID NO: 149 
SEQ ID NO: 150 
SEQ ID NO: 151 
SEQ ID NO: 152 

SEQ ID NO: 140 
SEQ ID NO: 148 
SEQ ID NO: 149 
SEQ ID NO: 150
SEQ ID NO: 151 
SEQ ID NO: 152 

SEQ ID NO: 140 
SEQ ID NO: 148 
SEQ ID NO: 149 
SEQ ID NO: 150 
SEQ ID NO: 151 
SEQ ID NO: 152 

SEQ ID NO: 140 
SEQ ID NO: 148 
SEQ ID NO: 149 
SEQ ID NO: 150 
SEQ ID NO: 151 
SEQ ID NO: 152 



2792 agtttgatgtcgcaatggccaccgtctattaccttgccgaccgcaatgtc 

512 ggoca aggccgaactgaaggec 

567 tgatatcgc— " 

6 53 tggcgca » 

745 .___-_--————--——— --cgaccg— — — i— 
5^9 cgttaattgattattecagcacaaag — 

2842 agtggtgagaca-ttccacccatcaggtggtttgcgttacgaacgcaccc 
634 tatg— — ~— —--—-—— -—--^—— -—teg aacgcagc- 

576 

650 ggttatactcttcctcgcatcggacgagtcgagttacg — 

751 r 

595 ggtgcga— — — — — — -ttgtttcctttacg— — ™~ 

2891 ctaccggtggcgaactcttcggcttgccctcaccggaacggctggcggag
549 tatccgctgggccgcatcgg-ecgtccggacgac 

576 ' 

599 — WM — w- T — ■ — -■■ « 

751 ggccagc 



616 cgttccatggcgaagtc gcttgc- 



ag 

tcaccggacagg- 

— gtggaa 



2941 ctggtcggaagcacggtctatctgataggtgaacatctgactgaacacct 

682 ctcgccggcatggcggtttatct — — — — = 

578 ctggt tctggct- 

710 tgatag- 



766 ctcg- 
639 



— cctcggcctatgtcat- 



-agataaa- 



2991 taacctgcttgcccgtgcgtacctcgaacgttacggggcacgtcaggtag 

705 : : — 

590 : 

716 

786 

645 — -ggca 

3041 tgatgattgttgagacagaaaccggggcagagacaatgcgtcgcttgctc 

705 : 

590 — — ~ tttctc 

716 — ' r ~ 

786 ' 

650 — tcagagtgaatgcg-! 

3091 cacgatcacgtcgaggctggtcggctgatgactattgtggccggtgatca 

705 ™~ 

596 c 1 tgatct 

716 — 
786 — 
664 — 

3141 gatcgaagccgctatcgaccaggctatcactcgctacggtcgcccagggc 

705 agccagcgacgaggc— 

603 gcttgaag — — : 

716 ~'~ — 



-gctgg- 



-gtggcgcccggt- 



791 cggatccgatgtcga- 

676 ccgatttggacaccgct- 



gctac- 






SEQ ID NO: 140
SEQ ID NO: 148
SEQ ID NO: 149
SEQ ID NO: 150
SEQ ID NO: 151
SEQ ID NO: 152

SEQ ID NO: 140
SEQ ID NO: 148
SEQ ID NO: 149
SEQ ID NO: 150
SEQ ID NO: 151
SEQ ID NO: 152 

SEQ ID NO: 140 
SEQ ID NO: 148 
SEQ ID NO: 149 
SEQ ID NO: 150 
SEQ ID NO: 151 
SEQ ID NO: 152 

SEQ ID NO: 140 
SEQ ID NO: 148 
SEQ ID NO: 149 
SEQ ID NO: 150 
SEQ ID NO: 151 
SEQ ID NO: 152

SEQ ID NO: 140 3390

SEQ ID NO: 148 757

SEQ ID NO: 149 652 

SEQ ID NO: 150 726 

SEQ ID NO: 151 636 

SEQ ID NO: 152 611 

SEQ ID NO: 140 
SEQ ID NO: 148 
SEQ ID NO: 149 
SEQ ID NO: 150 
SEQ ID NO: 151 
SEQ ID NO: 152 

SEQ ID NO: 140 
SEQ ID NO: 148
SEQ ID NO: 149
SEQ ID NO: 150
SEQ ID NO: 151
SEQ ID NO: 152



3191 cggtcgtctgtacccccttccggccactgccgacggtaccactggtcggg

720 - ~ ; 

611 1 - 

71* 

811 r 

693 tattccgg cgacattccctgagg 

3241 cgtaaagacagtgactggagcacagtgttgagtgaggctgaatttgccga

720 — ggcctgga — — — cga 

611 atacaggg 

716 -gaat 

811 gtgtcaggcgca ' 

716 aaaaag tga-aacagcac — r; ggcttggatacccca 

3291 gttgtgcgaacaccagctcacccaccatttccgggtagcgcgcaagattg 

731 gcggtgggatc tttg 

619 gctcatacaccgt — - - — 

720 — 

e23 r — acgattg 



748 atgggaagaccgggacagcc- 



ggttgagc— 



3341 ccctgagtgatggtgc-cagtctcgcgctggtcactcccgaaactacggc

746 .ccgtg gatggt- ~~™ ~- • 

632 tggggaaagctgcgcagtct— ~~~~ Z 

720 agatgg ' " 

830 
776 



ccgtga— 

atgcaggcgc-ctatgtectgctggcgtctgacgaa 



tacctcaactaccgagcaatttgctctggctaacttcatcaaaacgaccc 
— gaggagattgct — — 



-tcttccta- 



3440 ttcacgcttttacggctacgattggtgtcgagagcgaaagaactgctcag

757 ggcta 

664 ^atatgatt-— —— — 

726 — — — : 

836 ™ 

819 tatga— — — — — — 1 — cag 



3490 cgcattctgatcaatcaagtcgatctgacccggcgtgcgcgtgccgaaga

762 - - ■ — 

gtgtatctg : gctagtgataaagc 

•— gg 



-gaatg 



SEQ ID NO: 140 
SEQ ID NO: 148 
SEQ ID NO: 149 
SEQ ID NO: 150 
SEQ ID NO: 151 
SEQ ID NO: 152



673 ; 

726 : 

836 - 

827 ggca gaccattcatgt- 

3540 gccgcgtgatccgcacgagcgtcaacaagaactggaacgttttatcgagg

762 — ; 1 

696 taagagtgtt -acggggtcctgttat— 

728 goctcgtgat — — 

836 



848 gcggc- 



-cgtttfcat- 
SEQ ID NO: 140 3590 cagtcttgctggtcactgcaccactcccgcctgaagccgatacccgttac

SEQ ID NO: 148 762 — ; 

SEQ ID NO: 149 721 atcatggacaatg gactcgcgc 

SEQ ID NO: 150 738 «tga \- 

SEQ ID NO: 151 836 ccggcggcaagcc 

SEQ ID NO: 152 8$1 

SEQ ID NO: 140 3640 gccgggcggattcatcgcggacgggcgattaccgtgtaa

SEQ ID NO: 148 762 cacggccggatga 

SEQ ID NO: 149 743 tgca gtaa 

SEQ ID NO: 150 742 ™ 

SEQ ID NO: 151 849 —tttpctttga- 

SEQ ID NO: 152 861 — ----- -ttcaac= \ — r~ gtaa 
Figure 56

1 MVGKKVVHHL MMSAKDAHYT GNLVNGARIV NQWGDVGTEL
41 MVYVDGDISL FLGYKDIEFT APVYVGDFME YHGWIEKVGN 
81 QSYTCKFEAW KVATMVDITN PQDTRATACE PPVLCGRATG 
121 SLFIAKKDQR GPQESSFKER KHPGE (SEQ ID NO: 160) 
Figure 57

1 MVGKKVVHHL MMSAKDAHYT GNLVNGARIV NQWGDVGTEL
41 MVYVDGDISL FLGYKDIEFT APVYVGDFME YHGWIEKVGN
81 QSYTCKFEAW KVAKMVDITN PQDTRATACE PPVLCGTATG
121 SLFIAKDNQR GPQESSFKDA KHPQ (SEQ ID NO: 161)
Figure 58



1 ATGGTAGGTA AAAAGGTTGT ACATCATTTA ATGATGAGCG

41 CAAAAGATGC TCACTATACT GGAAACTTAG TAAACGGGGC 

81 TAGAATTGTG AATCAGTGGG GCGACGTTGG TACAGAATTA

121 ATGGTTTATG TTGATGGTGA CATAAGCTTA TTCTTGGGCT 

161 ACAAAGATAT CGAATTCACA GCTCCTGTAT ATGTTGGTGA

201 CTTTATGGAA TACCACGGCT GGATTGAAAA AGTTGGTAAC 

241 CAGTCCTATA CATGTAAATT TGAAGCATGG AAAGTTGCAA 

281 CAATGGTTGA TATCACAAAT CCTCAGGATA CACGCGCAAC 

321 AGCTTGTGAG CCTCCGGTAT TGTGCGGAAG AGCAACGGGT 

361 AGTTTGTTCA TCGCAAAAAA AGATCAGAGA GGCCCTCAGG 

401 AATCCTCTTT TAAAGAGAGA AAGCACCCCG GTGAATGA 
(SEQ ID NO: 162)
Figure 59

1 ATGGTAGGTA AAAAGGTTGT ACATCATTTA ATGATGAGCG

41 CAAAAGATGC TCACTATACT GGAAACTTAG TAAACGGCGC 

81 TAGAATTGTG AATCAGTGGG GCGACGTAGG TACAGAATTA

121 ATGGTTTATG TTGATGGTGA CATCAGCTTA TTCTTGGGCT

161 ACAAAGATAT CGAATTCACA GCTCCTGTAT ATGTTGGTGA 

201 TTTTATGGAA TACCACGGCT GGATTGAAAA AGTTGGCAAC 

241 CAGTCCTATA CATGTAAATT TGAAGCATGG AAAGTAGCAA 

281 AGATGGTTGA TATCACAAAT CCACAGGATA CACGTGCAAC 

321 AGCTTGTGAA CCTCCGGTAC TTTGTGGTAC TGCAACAGGC 

361 AGCCTTTTCA TCGCAAAGGA TAATCAGAGA GGTCCTCAGG 

401 AATCTTCCTT CAAGGATGCA AAGCACCCTC AATAA 
(SEQ ID NO: 163)
2. [ ] Claim Nos.:

because they relate to parts of the international application that do not comply with the prescribed requirements to such 
an extent that no meaningful international search can be carried out, specifically: 



3. [ ] Claim Nos.:

because they are dependent claims and are not drafted in accordance with the second and third sentences of Rule 6.4(a). 



Box II Observations where unity of invention is lacking (Continuation of Item 2 of first sheet) 



This International Searching Authority found multiple inventions in this international application, as follows: 
Please See Continuation Sheet 



1. [ ] As all required additional search fees were timely paid by the applicant, this international search report covers all
searchable claims.

2. [ ] As all searchable claims could be searched without effort justifying an additional fee, this Authority did not invite
payment of any additional fee.

3. [ ] As only some of the required additional search fees were timely paid by the applicant, this international search report
covers only those claims for which fees were paid, specifically claims Nos.:



4. [X] No required additional search fees were timely paid by the applicant. Consequently, this international search report is
restricted to the invention first mentioned in the claims; it is covered by claims Nos.: 1-42 & 44-47 (All partly)
BOX II. OBSERVATIONS WHERE UNITY OF INVENTION IS LACKING

This application contains the following inventions or groups of inventions which are not so linked as to form a single general inventive
concept under PCT Rule 13.1. In order for all inventions to be searched, the appropriate additional search fees must be paid.

Group I, claimtt) M2 & 44-47 (Partially), drawn to nucleic acid sequence of SEQ ID NO : 1. host cell and the method of making the 
polypeptide. 

Group II, claim(s) 1-42 & 44-47 (Partially), drawn to nucleic acid sequence of SEQ ID NO : 9, host cell and the method of making the 



Group in, claim(s) M2 & 44-47 (Partially), drawn to nucleic acid sequence of SEQ ID NO : 17, host cell and the method of making the 
polypeptide. 

Group IV, claim(s) I -42 & 44-47 (Partially), drawn to nucleic acid sequence of SEQ ID NO : 25, host cell and the method of making the 
polypeptide. 

Group V, claim(s) M2 & 44-47 (Partially), drawn to nucleic acid sequence of SEQ ID NO : 33, host cell and the method of making the 
polypeptide. 

Group VI, claim(s) 1-42 & 44-47 (Partially), drawn to nucleic acid sequence of SEQ ID NO : 34, host cell and the method of making the 
polypeptide. 

Group VII, claim(s) 1-42 & 44-47 (Partially), drawn to nucleic acid sequence of SEQ ID NO : 36, host cell and the method of making
the polypeptide. 

Group VIII, claim(s) 1-42 & 44-47 (Partially), drawn to nucleic acid sequence of SEQ ID NO : 38, host cell and the method of making
the polypeptide. 

Group IX, claim(s) 1-42 & 44-47 (Partially), drawn to nucleic acid sequence of SEQ ID NO : 40, host cell and the method of making the
polypeptide. 

Group X, claim(s) 1-42 & 44-47 (Partially), drawn to nucleic acid sequence of SEQ ID NO : 42, host cell and the method of making the
polypeptide. 

Group XI, claim(s) 1-42 & 44-47 (Partially), drawn to nucleic acid sequence of SEQ ID NO : 129, host cell and the method of making
the polypeptide. 

Group XII, claim(s) 1-42 & 44-47 (Partially), drawn to nucleic acid sequence of SEQ ID NO : 140, host cell and the method of making
the polypeptide. 

Group XIII, claim(s) 1-42 & 44-47 (Partially), drawn to nucleic acid sequence of SEQ ID NO : 142, host cell and the method of making
the polypeptide. 

Group XIV, claim(s) 1-42 & 44-47 (Partially), drawn to nucleic acid sequence of SEQ ID NO : 162, host cell and the method of making 
the polypeptide. 

Group XV, claim(s) 1-42 & 44-47 (Partially), drawn to nucleic acid sequence of SEQ ID NO : 163, host cell and the method of making 
the polypeptide. 

Group XVI, claim(s) 43 (Partially), drawn to binding agent to polypeptide of SEQ ID NO : 2. 
Group XVII, claim(s) 43 (Partially), drawn to binding agent to polypeptide of SEQ ID NO : 10. 
Group XVIII, claim(s) 43 (Partially), drawn to binding agent to polypeptide of SEQ ID NO : 18. 
Group XIX, claim(s) 43 (Partially), drawn to binding agent to polypeptide of SEQ ID NO : 26.
Group XX, claim(s) 43 (Partially), drawn to binding agent to polypeptide of SEQ ID NO : 35.

Group XXI, claim(s) 43 (Partially), drawn to binding agent to polypeptide of SEQ ID NO : 37. 

Group XXII, claim(s) 43 (Partially), drawn to binding agent to polypeptide of SEQ ID NO : 39.

Group XXIII, claim(s) 43 (Partially), drawn to binding agent to polypeptide of SEQ ID NO : 41.

Group XXIV, claim(s) 43 (Partially), drawn to binding agent to polypeptide of SEQ ID NO : 141. 

Group XXV, claim(s) 43 (Partially), drawn to binding agent to polypeptide of SEQ ID NO : 160. 

Group XXVI, claim(s) 43 (Partially), drawn to binding agent to polypeptide of SEQ ID NO : 161. 

Group XXVII, claim(s) 48-59 & 78-90, drawn to a method of making 3-HP (3-hydroxypropionic acid). 

Group XXVIII, claim(s) 60-64, drawn to a method of making polymerized 3-HP (3-hydroxypropionic acid).

Group XXIX, claim(s) 65-69, drawn to a method of making ester of 3-HP.

Group XXX, claim(s) 70-73, drawn to a method of making polymerized acrylate. 

Group XXXI, claim(s) 74-77, drawn to a method of making ester of acrylate. 

The inventions listed as Groups I-XXXI do not relate to a single general inventive concept under PCT Rule 13.1 because, under PCT
Rule 13.2, they lack the same or corresponding special technical features for the following reasons: Group I has a special technical
feature of a nucleic acid of SEQ ID NO : 1, which Groups II-XXXI do not share. Group II has a special technical feature of a nucleic
acid of SEQ ID NO : 9, which Groups I and III-XXXI do not share. Group III has a special technical feature of a nucleic acid of SEQ ID
NO : 17, which Groups I-II and IV-XXXI do not share. Group IV has a special technical feature of a nucleic acid of SEQ ID NO : 25,
which Groups I-III and V-XXXI do not share. Group V has a special technical feature of a nucleic acid of SEQ ID NO : 33, which
Groups I-IV and VI-XXXI do not share. Group VI has a special technical feature of a nucleic acid of SEQ ID NO : 34, which Groups I-V
and VII-XXXI do not share. Group VII has a special technical feature of a nucleic acid of SEQ ID NO : 36, which Groups I-VI and
VIII-XXXI do not share. Group VIII has a special technical feature of a nucleic acid of SEQ ID NO : 38, which Groups I-VII and
IX-XXXI do not share. Group IX has a special technical feature of a nucleic acid of SEQ ID NO : 40, which Groups I-VIII and X-XXXI do
not share. Group X has a special technical feature of a nucleic acid of SEQ ID NO : 42, which Groups I-IX and XI-XXXI do not share.
Group XI has a special technical feature of a nucleic acid of SEQ ID NO : 129, which Groups I-X and XII-XXXI do not share. Group
XII has a special technical feature of a nucleic acid of SEQ ID NO : 140, which Groups I-XI and XIII-XXXI do not share. Group XIII
has a special technical feature of a nucleic acid of SEQ ID NO : 142, which Groups I-XII and XIV-XXXI do not share. Group XIV has a
special technical feature of a nucleic acid of SEQ ID NO : 162, which Groups I-XIII and XV-XXXI do not share. Group XV has a
special technical feature of a nucleic acid of SEQ ID NO : 163, which Groups I-XIV and XVI-XXXI do not share. Group XVI has a
special technical feature of a binding agent to polypeptide of SEQ ID NO : 2, which Groups I-XV and XVII-XXXI do not share. Group
XVII has a special technical feature of a binding agent to polypeptide of SEQ ID NO : 10, which Groups I-XVI and XVIII-XXXI do
not share. Group XVIII has a special technical feature of a binding agent to polypeptide of SEQ ID NO : 18, which Groups I-XVII and
XIX-XXXI do not share. Group XIX has a special technical feature of a binding agent to polypeptide of SEQ ID NO : 26, which
Groups I-XVIII and XX-XXXI do not share. Group XX has a special technical feature of a binding agent to polypeptide of SEQ ID NO
: 35, which Groups I-XIX and XXI-XXXI do not share. Group XXI has a special technical feature of a binding agent to polypeptide of
SEQ ID NO : 37, which Groups I-XX and XXII-XXXI do not share. Group XXII has a special technical feature of a binding agent to
polypeptide of SEQ ID NO : 39, which Groups I-XXI and XXIII-XXXI do not share. Group XXIII has a special technical feature of a
binding agent to polypeptide of SEQ ID NO : 41, which Groups I-XXII and XXIV-XXXI do not share. Group XXIV has a special
technical feature of a binding agent to polypeptide of SEQ ID NO : 141, which Groups I-XXIII and XXV-XXXI do not share. Group
XXV has a special technical feature of a binding agent to polypeptide of SEQ ID NO : 160, which Groups I-XXIV and XXVI-XXXI
do not share. Group XXVI has a special technical feature of a binding agent to polypeptide of SEQ ID NO : 161, which Groups I-XXV
and XXVII-XXXI do not share. Groups XXVII-XXXI are drawn to making different products, employing different method steps and
end products, and are distinct among themselves as well as with respect to Groups I-XXVI, which employ sequences that Groups
XXVII-XXXI do not share.